WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 

C12Q 1/68, C12P 19/34, C07H 21/02, 
21/04 



A1



(11) International Publication Number: WO 98/07887

(43) International Publication Date: 26 February 1998 (26.02.98) 



(21) International Application Number: PCT/US97/14892

(22) International Filing Date: 22 August 1997 (22.08.97) 



(30) Priority Data: 

60/023,438 



23 August 1996 (23.08.96) 



US 



(71) Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA [US/US]; 22nd Floor, 300 Lakeside Drive,
Oakland, CA 94612-3550 (US).

(72) Inventors: FREIMER, Nelson, B.; 630 29th Street, San Francisco, CA 94131 (US).
LEON, Pedro; Centro de Investigaciones Biologia, University of Costa Rica, P.O. Box 2060,
San Jose (CR). REUS, Victor, I.; 1214 Third Avenue, San Francisco, CA 94122 (US).
SANDKUIJL, Lodewijk, A.; Voorstraat 27A, NL-2611 JK Delft (NL). McINNES, Lynne, Allison;
1599 Shrader Street, San Francisco, CA 94117 (US). SERVICE, Susan, K.; 816 Maher Road,
Watsonville, CA 95076 (US).

(74) Agents: HUGHES, Melya, J.; Cooley Godward LLP, 3000 El 
Camino Real, Five Palo Alto Square, Palo Alto, CA 94306- 
2155 (US) et al. 



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE,
GH, HU, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR,
LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ,
PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR,
TT, UA, UG, UZ, VN, YU, ZW, ARIPO patent (GH, KE,
LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ, BY,
KG, KZ, MD, RU, TJ, TM), European patent (AT, BE, CH,
DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT,
SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML,
MR, NE, SN, TD, TG).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: METHODS FOR TREATING BIPOLAR MOOD DISORDER ASSOCIATED WITH MARKERS ON CHROMOSOME 18p



(57) Abstract 

The present invention is directed to methods of detecting the presence of a bipolar mood disorder susceptibility locus in an individual,
comprising analyzing a sample of DNA for the presence of a DNA polymorphism on the short arm of chromosome 18 between the telomere
and D18S481, wherein the DNA polymorphism is associated with a form of bipolar mood disorder. The invention for the first time provides
strong evidence of a susceptibility gene for bipolar mood disorder that is located in the terminal 5 cM region of the short arm of chromosome
18. The disclosure describes the use of linkage analysis and genetic markers in this 5 cM region to fine map the region and the use of
genetic markers to genetically diagnose (genotype) bipolar mood disorder in individuals, to confirm phenotypic diagnoses of bipolar mood
disorder, and to determine appropriate treatments for patients with particular genotypic subtypes. Isolated polynucleotides useful for genetic
METHODS FOR TREATING BIPOLAR MOOD DISORDER
ASSOCIATED WITH MARKERS ON CHROMOSOME 18p
invention.

INTRODUCTION

Background

Bipolar Mood Disorder (BP)

Manic-depressive illness, or bipolar mood disorder (BP), is characterized by episodes
of elevated mood (mania) and depression and is among the most prevalent and potentially
devastating of psychiatric syndromes. The most severe and clinically distinctive forms of BP
are BP-I (severe bipolar mood disorder) and SAD-M (schizoaffective disorder manic type),
which are characterized by at least one full episode of mania, with or without episodes of major
depression (defined by lowered mood, or depression, with associated disturbances in
rhythmic behaviors such as sleeping, eating, and sexual activity). A milder form of BP is
BP-II, bipolar mood disorder with hypomania and major depression. BP-I often co-segregates
in families with more etiologically heterogeneous syndromes, such as unipolar
major depressive disorder (MDD), which is a more broadly defined phenotype. See
McInnes, L.A. and Freimer, N.B., Mapping genes for psychiatric disorders and behavioral
traits, Curr. Opin. in Genet. and Develop., 5:376-381 (1995).

Treatment of Individuals with Bipolar Mood Disorder

An estimated 2-3 million people in the United States are affected by BP-I. Currently,
individuals are typically evaluated for bipolar mood disorder using the clinical criteria set
forth in the most current version of the American Psychiatric Association's Diagnostic and
Statistical Manual of Mental Disorders (DSM). Many drugs have been used to treat
individuals diagnosed with bipolar mood disorder, including lithium salts, carbamazepine and
valproic acid. However, none of the currently available drugs is able to treat every
individual diagnosed with severe bipolar mood disorder (termed BP-I), and drug treatments
are effective in only approximately 60-70% of individuals diagnosed with BP-I. Moreover, it
is currently impossible to predict which drug treatments will be effective in particular BP-I
affected individuals. Commonly, upon diagnosis, affected individuals are prescribed one drug
after another until one is found to be effective. Early prescription of an effective drug
treatment is crucial for several reasons, including the avoidance of extremely dangerous manic
episodes and the risk of progressive deterioration if effective treatments are not found. Also,
appropriate treatment may prevent depressive episodes in BP-I individuals; these episodes are
also dangerous and are characterized by a high suicide rate. The high prevalence of the
disorder, together with the frequent occurrence of hospitalizations, psychosocial impairment,
suicide and substance abuse, has made BP-I a major public health concern.






Genetic Basis for Bipolar Mood Disorder

Mapping genes for common diseases believed to be caused by multiple genes, such as
BP-I, may be complicated by the typically imprecise definition of phenotypes, by etiologic
heterogeneity and by uncertainty about the mode of genetic transmission of the disease trait.
With psychiatric disorders there is even greater ambiguity in distinguishing
individuals who likely carry an affected genotype from those who are genetically unaffected.
For example, one can define an affected phenotype for BP by including one or more of the
broad grouping of diagnostic classifications that constitute the mood disorders: BP-I, SAD-M,
MDD, and BP-II.

Thus, one of the greatest difficulties facing psychiatric geneticists is uncertainty
regarding the validity of phenotype designations, since clinical diagnoses are based solely on
clinical observation and subjective reports. Also, with complex traits such as psychiatric
disorders, it is difficult to map the trait-causing genes genetically because: (1) the BP-I
phenotype does not exhibit classic Mendelian recessive or dominant inheritance patterns
attributable to a single genetic locus; (2) there may be incomplete penetrance, i.e., individuals
who inherit a predisposing allele may not manifest the disease; (3) the phenocopy
phenomenon may occur, i.e., individuals who do not inherit a predisposing allele may
nevertheless develop the disease due to environmental or random causes; and (4) genetic
heterogeneity may exist, in which case mutations in any one of several genes may result in
identical phenotypes.

The existence of one or more major genes associated with BP-I and with a clinically
similar diagnostic category, SAD-M (schizoaffective disorder manic subtype), is supported by
segregation analyses and twin studies (Bertelson et al., 1977; Freimer and Reus, 1992; Pauls
et al., 1992). However, efforts to identify the chromosomal location of BP-I genes have
yielded disappointing results, in that reports of linkage between BP-I and markers on
chromosomes X and 11 could be neither independently replicated nor confirmed in the
re-analyses of the original pedigrees (Baron et al., 1987; Egeland et al., 1987; Kelsoe et al.,
1989; Baron et al., 1993). The possible localization of BP genes on chromosomes 18
(pericentromeric region) and 21q has been suggested, but in both cases the proposed
candidate region is not well defined and support for either location is equivocal
(Berrettini et al. (1994) Proc. Natl. Acad. Sci. USA, 91, 5918-5921; Murray, J.C., et al.
(1994) Science 265, 2049-2054; Pauls et al., Am. J. Hum. Genet. 57:636-643 (1995); Maier
et al., Psych. Res. 59:7-15 (1995); Straub et al., Nature Genet., 8:291-296 (1994)). Recent
investigations have led to the isolation of chromosome 18-specific brain transcripts which
have been suggested to be positional candidates for bipolar disorder (Yoshikawa et al., Am.
J. Med. Gen. 74, 140-149 (1997)).

Despite abundant evidence that BP has a major genetic component, linkage studies
have not yet succeeded in definitively localizing a BP gene. This is mainly because mapping
studies of psychiatric disorders have generally been conducted under a paradigm appropriate
for mapping genes for simple Mendelian disorders, namely, using linkage analysis in the
expectation of finding high lod scores that definitively signpost the location of disease genes.
The follow-up to early BP linkage studies, however, showed that even extremely high lod
scores at a single location can be false positives. See Egeland et al., Nature 325:783-787
(1987); Baron et al., Nature 326:289-292 (1987); Kelsoe et al., Nature 342:238-243 (1989);
and Baron et al., Nature Genet. 3:49-55 (1993). These earlier studies used largely
uninformative markers and did not use stringent criteria for identifying affected individuals.




Linkage Disequilibrium Analysis

Linkage disequilibrium (LD) analysis is a powerful tool for mapping disease genes
and may be particularly useful for investigating complex traits. LD mapping is based on the
following expectations: for any two members of a population, it is expected that
recombination events occurring over several generations will have shuffled their genomes so
that they share little in common with their ancestors. However, if these individuals are
affected with a disease inherited from a common ancestor, the gene responsible for the
disease and the markers that immediately surround it will likely be inherited without change,
or IBD ("identical by descent"), from that ancestor. The size of the regions that remain
shared (i.e., IBD) is inversely proportional to the number of generations separating the
affected individuals and their common ancestor. Thus, "old" populations are suitable for fine
scale mapping, and recently founded ones are appropriate for using LD to locate
disease genes more approximately (Houwen et al., 1994, in particular Fig. 3 and
accompanying text). Because isolated populations typically have had a small number of
founders, they are particularly suitable for LD approaches, as indicated by several successful
LD studies conducted in Finland (de la Chapelle, 1993).
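The inverse relationship between shared-segment size and generations can be illustrated with a rough calculation. The sketch below is not part of the disclosed method; it assumes a simplified model in which crossovers occur as a Poisson process without interference, so that the ancestral segment preserved on each side of a disease gene after m meioses has mean length 100/m cM. The function name and the example values are hypothetical.

```python
def expected_ibd_segment_cm(meioses: int) -> float:
    """Expected length (cM) of the ancestral segment still flanking a disease gene.

    Assumption (not from the source text): crossovers follow a Poisson process
    with no interference, so each flank is exponentially distributed with mean
    100/m cM after m meioses, and the two flanks add.
    """
    if meioses < 1:
        raise ValueError("meioses must be at least 1")
    return 2 * 100.0 / meioses


# Illustrative values only: older founding events leave shorter shared segments.
for m in (5, 10, 20, 100):
    print(f"{m:>3} meioses -> about {expected_ibd_segment_cm(m):.1f} cM expected")
```

Under this model, for two affected individuals the relevant count is the total number of meioses linking them through the common ancestor, which is why an older population leaves a shorter shared segment and permits finer mapping.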

LD analysis has been used in several positional cloning efforts (Kerem et al., 1989;
MacDonald et al., 1992; Petrukhin et al., 1993; Hastbacka et al., 1992 and 1994), but in
each case the initial localization had been achieved using conventional linkage methods.
Positional cloning is the isolation of a gene solely on the basis of its chromosomal location,
without regard to its biochemical function. Lander and Botstein (1986) proposed that LD
mapping could be used to screen the human genome for disease loci, without conventional
linkage analyses. This approach was not practical until a set of mapped markers covering
the genome became available (Weissenbach et al., 1992). The feasibility of genome
screening using LD mapping is now demonstrated by the applicants.

Identification of the chromosomal location of a gene responsible for causing severe
bipolar mood disorder can facilitate diagnosis, treatment and genetic counseling of
individuals in affected families.

Due to the severity of the disorder and the limitations of a purely phenotypic
diagnosis of BP-I, there is a tremendous need to subtype individuals with BP-I genetically to
confirm clinical diagnoses and to determine appropriate therapies based on their genotypic
subtype.

SUMMARY OF THE INVENTION

The present invention comprises using genetic linkage and haplotype analysis to
identify an individual having a bipolar mood disorder gene on the short arm of chromosome
18. In addition, the present invention provides markers linked to a gene responsible for
susceptibility to bipolar mood disorder that will enable researchers to focus future analysis on
that small chromosomal region and will accelerate the sequencing of a bipolar mood disorder
gene located at 18p.

The present invention provides, for the first time, a localization of a BP-I
susceptibility locus to a 300 to 500 kb region of the short arm of chromosome 18.

The present invention is directed to methods of detecting the presence of a bipolar
mood disorder susceptibility locus in an individual, comprising analyzing a sample of DNA
for the presence of a DNA polymorphism on the short arm of chromosome 18 between
SAVA5 and ga203, wherein the DNA polymorphism is associated with a form of bipolar
mood disorder. The invention includes the use of genetic markers in the roughly 500 kb
region between the SAVA5 locus and the ga203 locus, inclusive, to diagnose bipolar mood
disorder genetically in individuals and to confirm phenotypic diagnoses of bipolar mood
disorder. Preferably, the sample of DNA is analyzed for the presence of a DNA
polymorphism on the short arm of chromosome 18 in the roughly 300 kb region between
D18S1140 and W3422.
In a further embodiment, the invention provides methods of classifying subtypes of
bipolar mood disorder by identifying one or more DNA polymorphisms located within the
500 kb region between the SAVA5 and ga203 loci, inclusive, on the short arm of chromosome
18 and analyzing DNA samples from individuals phenotypically diagnosed with bipolar mood
disorder for the presence or absence of one or more of said DNA polymorphisms.
Preferably, the sample of DNA is analyzed for the presence or absence of one or more of
said DNA polymorphisms in the roughly 300 kb region between D18S1140 and W3422 on
the short arm of chromosome 18.

In yet a further embodiment, the methods of the invention include a method of
treating an individual diagnosed with bipolar mood disorder comprising identifying one or
more DNA polymorphisms located within the 500 kb region of chromosome 18 between
SAVA5 and ga203, analyzing DNA samples from individuals phenotypically diagnosed with
bipolar mood disorder for the presence or absence of one or more of the DNA
polymorphisms, and selecting a treatment plan that is most effective for individuals having a
particular genotype within the 500 kb region of chromosome 18 between SAVA5 and ga203.
Preferably, the sample of DNA is analyzed for the presence or absence of one or more DNA
polymorphisms in the roughly 300 kb region between D18S1140 and W3422 on the short
arm of chromosome 18.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a pedigree chart showing two families, CR001 and CR004. Affected
individuals are denoted by black symbols, deceased individuals by a diagonal slash. A
schematic of each individual's haplotype (where available) is shown below the ID number.
Recombinations are denoted by "-x"; consanguineous marriages by a double bar, and the
conserved haplotype as black shading within the haplotype bars. The larger conserved region
for CR004 is stippled; the larger conserved region for CR001 is indicated by a dashed
outline. An "I" underneath the haplotype bars indicates inferred haplotype. A indicates
phase is uncertain. The connection between CR001 and CR004, dating to an 18th Century
founding couple, is indicated by the dashed lines joining individuals III-6 and I-4.
FIG. 2 is a table of lod scores for markers covering the entire human genome that
exceeded the arbitrary coverage thresholds. Lod scores are shown for two markers on
chromosome 18: D18S59 and D18S1105.

FIG. 3 depicts the extent of marker coverage used in the pedigree genome screening
study for each chromosome. Coverage is defined as regions for which a lod score of at least
1.6 would have been detected (in the combined data set) for markers truly linked to BP-I
under the model employed. Areas that remain uncovered (at this threshold) are unshaded.
Markers for which lod scores were obtained that exceeded the empirically determined
coverage thresholds in CR001, CR004, or the combined data set, are shown at their
approximate chromosomal location. The symbols to the right of the chromosome indicate the
thresholds exceeded at that marker: a circle signifies that the lod score at a marker exceeded
the threshold of 0.8 in CR001, a diamond signifies that the lod score exceeded the threshold
of 1.2 in CR004, and a star signifies that the lod score exceeded the threshold of 1.6 in the
combined data set.
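The symbol convention of FIG. 3 can be expressed as a small lookup. The following is a minimal sketch, assuming "exceeded" means strictly greater than the threshold; the thresholds (0.8, 1.2, 1.6) come from the figure description, while the function name and the example lod scores are hypothetical.

```python
# Thresholds and symbols as described for FIG. 3.
THRESHOLDS = {"CR001": 0.8, "CR004": 1.2, "combined": 1.6}
SYMBOLS = {"CR001": "circle", "CR004": "diamond", "combined": "star"}


def symbols_for_marker(lod_by_dataset):
    """Return the FIG. 3 symbols earned by one marker.

    lod_by_dataset maps a data set name ("CR001", "CR004", "combined")
    to the lod score obtained at the marker; missing data sets earn nothing.
    """
    return [
        SYMBOLS[name]
        for name, threshold in THRESHOLDS.items()
        if lod_by_dataset.get(name, float("-inf")) > threshold
    ]


# Hypothetical lod scores, for illustration only.
print(symbols_for_marker({"CR001": 0.9, "CR004": 1.0, "combined": 1.7}))
```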

FIGS. 4A and 4B depict the lod score for the maximum likelihood estimate (MLE) of theta
in the combined sample for the 473 microsatellite markers typed in the pedigree genome
screen. The MLEs of theta were assigned to the following categories: theta < 0.10; 0.10
< theta < 0.40; theta > 0.40. Note that the scale for the x-axis (distance from pter)
changes with chromosomes.
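For readers unfamiliar with the quantities plotted in FIGS. 4A and 4B, the sketch below computes a two-point lod score and the MLE of theta in the simplest textbook setting: fully informative, phase-known meioses. This is an assumption made for illustration; the study itself used pedigree likelihoods under a specified genetic model, which this code does not reproduce. The handling of values falling exactly on 0.10 or 0.40 is also an assumption, since the category boundaries are not spelled out.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point lod score for phase-known, fully informative meioses.

    lod(theta) = log10( theta^r * (1 - theta)^(n - r) / 0.5^n )
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    nonrecombinants = meioses - recombinants
    linked = (theta ** recombinants) * ((1.0 - theta) ** nonrecombinants)
    if linked == 0.0:
        return float("-inf")  # data impossible at this theta
    unlinked = 0.5 ** meioses
    return math.log10(linked / unlinked)


def theta_mle(recombinants, meioses):
    """MLE of the recombination fraction, capped at 0.5 (no linkage)."""
    return min(recombinants / meioses, 0.5)


def theta_category(theta):
    """Bin a theta estimate into the three categories used for FIGS. 4A and 4B."""
    if theta < 0.10:
        return "theta < 0.10"
    if theta <= 0.40:  # boundary handling is an assumption
        return "0.10 to 0.40"
    return "theta > 0.40"


# Hypothetical counts: 2 recombinants observed in 20 informative meioses.
r, n = 2, 20
theta_hat = theta_mle(r, n)
print(theta_hat, theta_category(theta_hat), round(lod_score(r, n, theta_hat), 2))
```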



FIG. 5 is a portion of an integrated map of the 5 cM 18pter region of chromosome 18.

FIGS. 6A, 6B and 6C are a list of markers on chromosome 18, with map positions noted.

FIG. 7 describes 18p allele frequencies for disease chromosomes (aff 105) versus
nontransmitted chromosomes (ntrans) and samples from a control population of Costa Rican
students and their parents (control). The name of each marker used in this study is indicated
on the left. The second column of numbers refers to allele length in base pairs.
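The comparison summarized in FIG. 7 rests on per-group allele frequencies at each marker. A minimal sketch of that tabulation follows; the group labels mirror the figure (aff, ntrans, control), but the allele lengths listed are hypothetical placeholders and do not reproduce the figure's data.

```python
from collections import Counter


def allele_frequencies(allele_lengths):
    """Return {allele length in bp: relative frequency} for one chromosome group."""
    counts = Counter(allele_lengths)
    total = sum(counts.values())
    return {allele: count / total for allele, count in sorted(counts.items())}


# Hypothetical allele lengths (bp) observed at a single marker in each group.
groups = {
    "aff": [154, 154, 154, 156, 160, 154],
    "ntrans": [154, 156, 156, 160, 162, 158],
    "control": [154, 156, 158, 160, 160, 162],
}

for label, alleles in groups.items():
    print(label, allele_frequencies(alleles))
```

An allele that is markedly more frequent on disease chromosomes than on nontransmitted or control chromosomes is the kind of signal that LD analysis looks for.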






FIG. 8 depicts haplotype analysis of individuals affected with BP-I. The column
labelled 18p refers to the patient identifier, and each patient identifier is repeated in 2 rows
to indicate allele results for each of the patient's two copies of chromosome 18. The
columns labelled "PANR" and "MANR" refer to the paternal and maternal identifiers,
respectively, associated with the particular patient, other than 0, 1 and 2, which indicate that
parental samples were not available. The column headings to the right of the "PANR" and
"MANR" columns represent names of specific markers in the 18p region that were used in
the haplotype analysis. The markers are listed in the order they appear on chromosome 18.
The allele length (in base pairs) is indicated under the column heading of each marker for a
particular patient. In the column to the immediate right of each marker column, a "1"
indicates that the phase is known, i.e., that it is known whether a particular allele is inherited
from the paternal or maternal chromosome, and a "0" indicates that the phase is not
definitely known. The shaded horizontal bars depict haplotypes of at least three markers
which include a 154 allele length at D18S59, other than patients 218, 225, 232, 234, 311,
314 and 458, where the stippled region depicts small sections that do not have the 154 allele
at D18S59. The hatched regions depict uncertainty as to whether the individual has the
affected haplotype, as the phase is not known with certainty. In addition, the presence of an
allele length of 232 (or 234) with marker ta201 is thought to result from a highly mutable
allele and may not be distinct from the 230 allele. Similarly, the 202 allele at ca212 may not
be distinct from the 200 allele at ca212. Patients 246, 247, 248, 311, 316, 367, 384, 501,
531, 587, 536, 684, 667 and 669 exhibit a 242, 244, 250, 252 or 214 allele at marker ta201,
which indicates a potential marker location. Patients 488, 435 and 236 exhibit haplotypes
that are distinct from the pedigrees that were analyzed.
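The shading rule described for FIG. 8 (a run of at least three consecutive markers that includes the 154 allele at D18S59) can be sketched as a simple scan along one chromosome. This is an illustrative reading of the figure description, not the procedure used to prepare the figure. Only the marker D18S59 and its 154 allele are taken from the text; the other marker names, the marker order, and all other allele lengths below are hypothetical.

```python
def conserved_run(chromosome, ancestral, anchor):
    """Return (start, end) indices of the contiguous run of markers that match
    the ancestral haplotype and contain the anchor marker, or None if the
    anchor marker itself does not match."""
    if chromosome[anchor] != ancestral[anchor]:
        return None
    start = anchor
    while start > 0 and chromosome[start - 1] == ancestral[start - 1]:
        start -= 1
    end = anchor
    while end < len(chromosome) - 1 and chromosome[end + 1] == ancestral[end + 1]:
        end += 1
    return (start, end)


def carries_conserved_haplotype(chromosome, ancestral, anchor, min_markers=3):
    """True if the matching run around the anchor spans at least min_markers."""
    run = conserved_run(chromosome, ancestral, anchor)
    return run is not None and (run[1] - run[0] + 1) >= min_markers


# Hypothetical marker order and alleles; only D18S59 = 154 comes from the text.
markers = ["markerA", "markerB", "D18S59", "markerC", "markerD"]
ancestral = [100, 230, 154, 200, 120]
anchor = markers.index("D18S59")

patient_chromosome = [102, 230, 154, 200, 118]
print(conserved_run(patient_chromosome, ancestral, anchor))
print(carries_conserved_haplotype(patient_chromosome, ancestral, anchor))
```

A fuller treatment would also account for unknown phase and for the mutable alleles noted above (for example, treating 232 and 230 at ta201 as equivalent); both are omitted here for brevity.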



FIG. 9 depicts haplotype analysis of nontransmitted chromosomes from parents of
individuals affected with BP-I. The labels "ERSN" and "KID" refer to the parental and
patient identifiers, respectively. As above, allele length is provided in base pairs below each
marker, with an indication as to whether phase was known (1) or not known (0) given to the
right of these values. The markers, shading and allele characteristics described for Figure 8
also apply to this figure.

FIG. 10 depicts haplotype analysis of control samples obtained from an unscreened
population of students of the University of Costa Rica and their parents, representing the
general population. Identifiers are provided in the column headed "com", with allele length and
phase determination given in the remainder of the table. The markers, shading and allele
characteristics described for Figure 8 also apply to this figure. Complete data for all
markers are not given, as indicated by blank boxes or the terms "miss" or "missing".



FIG. 11 depicts Ancestral Haplotype Reconstruction results in disease chromosomes.

DESCRIPTION OF SPECIFIC EMBODIMENTS

The recent availability of highly polymorphic, genetically mapped markers covering
the human genome (Weissenbach, J., et al. (1992) Nature 359, 794-801; Murray, J.C., et al.
(1994) Science 265, 2049-2054; Gyapay, G., et al. (1994) Nature Genet. 7, 246-339) has
allowed the development of a multi-stage paradigm for mapping genes for complex traits. In
the first stages, complete genome screening (e.g., through lod score analysis) is used to
identify possible localizations for disease genes. Subsequently, the regions highlighted by the
screening study are more intensively investigated to confirm the initial localizations and
delineate clear candidate regions. Finally, fine mapping methods (such as haplotype or
linkage disequilibrium (LD) analysis) or candidate gene approaches are used for positional
cloning of disease genes.

Our genome screening study for BP employed the following strategies. Unlike
previous genetic studies of BP, only those individuals with the most severe and clinically
distinctive forms of BP (BP-I and schizoaffective disorder manic type, SAD-M) were
considered as affected, rather than including those diagnosed with a milder form of BP
(BP-II) or with unipolar major depressive disorder (MDD). Two large pedigrees (CR001 and
CR004) were selected from a genetically homogeneous population, that of the Central Valley
of Costa Rica (as described in Escamilla, M.A., et al. (1996) Neuropsychiat. Genet. 67,
244-253, and in Freimer, N.B., et al. (1996) Neuropsychiat. Genet. 67, 254-263, both
incorporated by reference herein). The entire human genome was screened for linkage using
mapped microsatellite markers and a model for genetic analysis in which most of the linkage
information was derived from affected individuals. The goal of this stringent linkage
analysis was to identify all regions potentially harboring major genes for BP-I in the study
population. Empirically determined lod score thresholds (using linkage simulation analyses)
were derived to suggest regions worthy of further investigation.

Identification of all suggestive regions and weighing the relative importance of
findings required complete screening of the genome. The coverage approach was developed
to gauge the progress of this effort. Conventionally, the thoroughness of genome screening
is evaluated by excluding genome regions from linkage under given genetic models. This
approach, which is highly sensitive to misspecification of genetic models, may be poorly
suited for genome screening studies of complex traits: it is tied to the expectation of finding
linkage at a single locus and demonstrating absence of linkage at all other locations in the
genome. Additionally, exclusion analyses do not differentiate between genome regions
where linkage is not excluded because markers are uninformative in the study population
and those in which the genotype data are simply ambiguous. In contrast, the coverage
approach is designed for studies aimed at genome screening rather than for studies where the
goal is to demonstrate a single unequivocal linkage finding, and it provides explicit data
regarding the informativeness of markers in the study pedigrees. Its use lessens the
possibility that one would prematurely dismiss a given genome region as being unpromising
for further study.

Because the exact genetic length of chromosomes is not clearly established, it is
impossible to be certain that one has screened the entire genome. Although we report
coverage of about 94% of the genome (under the 90% dominant model) at the thresholds
described above, this probably represents an underestimate. The remaining coverage gaps in
our study occur predominantly at or near telomeres; as the upper bound estimates for the
length of each chromosome were used, it is likely that the actual coverage gaps in these
regions are smaller than our conservative assessment.
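To illustrate why an upper-bound chromosome length yields a conservative coverage figure, the sketch below computes the covered fraction of a chromosome from a set of covered intervals. It assumes, hypothetically, that each informative marker covers an interval expressed in cM from pter; the interval values and the function name are illustrative and do not come from the study.

```python
def covered_fraction(intervals, chromosome_length_cm):
    """Fraction of a chromosome covered by the union of (start, end) intervals in cM."""
    covered = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip each interval to the chromosome before merging.
        start, end = max(start, 0.0), min(end, chromosome_length_cm)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / chromosome_length_cm


# Hypothetical covered intervals (cM from pter) on one chromosome.
intervals = [(5, 40), (30, 70), (80, 100)]
print(covered_fraction(intervals, 100.0))  # shorter length estimate
print(covered_fraction(intervals, 110.0))  # upper-bound length estimate
```

With the same covered intervals, the larger length estimate gives the smaller covered fraction, which is the sense in which the reported coverage is described as a probable underestimate.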

The presence of consistently positive lod scores over a given region was considered to
be of greater significance than isolated peak lod scores. Such clustering suggests true
co-segregation of markers and phenotypes (i.e., alleles are shared identically by descent rather
than identically by state) and is more readily observed in analyses of a few large pedigrees
(as in our study) than in examination of several smaller families. The data presented herein
indicate clustering of positive lod scores in the region of the telomere of 18p.

The genome screen was conducted in two stages. The Stage I screen identified areas
suggestive of linkage, so that those areas could be saturated with available markers, and so
that regions, referred to as "coverage gaps", could be pinpointed where markers were
insufficiently informative in our sample to detect evidence of linkage. The Stage II screen
followed up on regions flanking each marker that yielded peak lod scores approximately
equal to or greater than the thresholds used for the coverage calculations, which were
deemed regions of interest, and filled in coverage gaps. The results of the complete genome
screen (Stages I and II) using 473 markers are described below.

In addition, linkage disequilibrium analysis of an independently collected sample of 48
unrelated BP-I patients was initially conducted. These patients were from the same ancestral
population as the patients in the CR001 and CR004 pedigrees. The LD analysis was
conducted with markers on the short arm of chromosome 18 (18p), in a 5 centimorgan (cM)
region ("5 cM 18pter region") extending from the end of the 18p telomere to a distance of 5
cM along the short arm of chromosome 18. The LD analysis gave evidence of LD in
this region, particularly at marker D18S59 and also at D18S476. LD analysis of further BP-I
patients from the Central Valley of Costa Rica (CRCV) with markers in this 5 cM 18pter
region was conducted to confirm and fine map a BP-I gene in this region. This approach,
using additional BP-I patients from this CRCV population and additional markers, identifies
the region of maximum LD and can precisely localize a BP-I susceptibility gene.

Fine mapping of the 5 cM 18pter region resulted in the identification of two DNA
markers (D18S1140 and W3422) defining the boundaries of the BP-I candidate region as
approximately 300 kb, thus allowing a systematic search for the BP-I gene(s).
A conservative approach to linkage analysis was used in that almost all of the
information for linkage is derived from individuals with a severe, narrowly defined
phenotype. While this approach made it very unlikely that lod scores greater than
conventional thresholds of statistical significance (e.g., >3) would be obtained, it provided
confidence in the robustness of the most suggestive findings.

Direct cDNA selection can be used to isolate segments of expressed DNA from the
300 kb region between D18S1140 and W3422 (M. Lovett, J. Kere, L.M. Hinton, Proc.
Natl. Acad. Sci. USA 88, 9628-9632 (1991); Y.-S. Jou et al., Genomics 24, 410-413 (1994)).
By using bacterial artificial chromosomes (BAC) (e.g., commercially available from
Research Genetics Inc., Huntsville, Alabama), a group of cDNAs can be identified, and
hybridization and PCR-amplification experiments can be used to determine if these cDNA
segments are derived from the 300 kb region.

The cDNAs can then be used to determine whether specific sequences are expressed
at lower levels (or not at all) in affected individuals compared to non-carrier individuals.
Measurement of mRNA levels in lymphoblastoid cell lines can be used as an initial screen.
The cell lines are prepared by drawing blood from individuals, transforming the lymphoblasts
with EBV and growing the immortalized cells in culture. Total RNA and DNA are extracted
from the cultured human lymphoblastoid cell lines. Northern blot hybridization is used to
detect reduced levels of a specific sequence compared to levels from an unaffected,
non-carrier individual. Such reduced levels can result from mutations in the BP-I gene on the
chromosomes of affected individuals that decrease levels of mature mRNA and play a primary
role in BP-I. Thus, alterations in gene sequences in affected individuals can be determined.
The polymerase chain reaction (PCR) is used to amplify the gene and to determine its
sequence from affected individuals. Sequence comparison with unaffected, non-carrier
individuals is carried out to identify polymorphisms in the gene sequence that are responsible
for BP-I.

The identification of the biochemical defect that causes BP-I provides a basis for
treatments for this disease. In addition, knowledge that certain mutations in the gene are
responsible for the disease allows mutation detection tests to be used as a definitive diagnosis
for BP-I.
1 Thus, the present invention allows the isolation of a nucleic acid molecule that can be 

2 used in the identification of the presence (or absence) of a mutation in the BP-I gene of a human 

3 and thus can be used in the diagnosis of BP-I or in the genetic counseling of individuals, for 

4 example those with a family history of BP-I (although the general population can be screened 

5 as well). In particular, it should be noted that any mutation in the BP-I gene away from the 

6 normal gene sequence is an indication of a potential genetic flaw; even so-called "silent" 

7 mutations that do not encode a different amino acid at the location of the mutation are 

8 potential disease mutations, since such mutations can introduce into (or remove from) the 

9 gene an untranslated genetic signal that interferes with the transcription or translation of the 

10 gene. Thus, advice can be given to a patient concerning the potential for transmission of BP- 

11 I if any mutation is present. While an offspring with the mutation in question may or may 

12 not have symptoms of BP-I, patient care and monitoring can be selected that will be 

13 appropriate for the potential presence of the disease; such additional care and/or monitoring 

14 can be eliminated (along with the concurrent costs) if there are no differences from the 

15 normal gene sequence. As additional information (if any) becomes available (e.g., that a 

16 given silent mutation or conservative replacement mutation does or does not result in BP-I), 

17 the advice given for a particular mutation may change. However, the change in advice given 

18 does not alter the initial determination of the presence or absence of mutations in the gene 

19 causing BP-I. 

20 Generally, mutations are identified in the human gene for use in a method of detecting 

21 the presence of a genetic defect that causes or may cause BP-I, or that can or may transmit 

22 BP-I to an offspring of the human. Initially, the practitioner will be looking simply for 

23 differences from the sequence identified as being normal and not associated with disease, 

24 since any deviation from this sequence has the potential of causing disease, which is a 

25 sufficient basis for initial diagnosis, particularly if the different (but still unconfirmed) gene 

26 is found in a person with a family history of BP-I. As specific mutations are identified as 

27 being positively correlated with BP-I (or its absence), practitioners will in some cases focus 

28 on identifying one or more specific mutations of the gene that changes the sequence of a 

29 protein product of the gene or that results in the gene not being transcribed or translated. 

30 However, simple identification of the presence or absence of any mutation in the gene of a 
1 patient will continue to be a viable part of genetic analysis for diagnosis, therapy and 

2 counseling. 
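
As an illustration of the initial screen described above, in which the practitioner looks simply for any difference from the sequence identified as normal, the following minimal sketch lists substitutions between two sequences. It assumes the sequences are already aligned and of equal length; the function name and the example sequences are hypothetical and are not BP-I gene sequences.

```python
def find_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and of
    equal length; insertions and deletions are not handled in this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(
            zip(reference.upper(), sample.upper())
        )
        if ref_base != alt_base
    ]


# Hypothetical sequences, for illustration only.
normal = "ATGGCCTTAGGC"
patient = "ATGGCTTTAGAC"
differences = find_differences(normal, patient)
print(differences)
```

Any non-empty result would flag the sample for follow-up, whether or not the change alters an amino acid, consistent with the treatment of "silent" mutations above.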

3 The actual technique used to identify the gene or gene mutant is not itself part of the 

4 practice of the invention. Any of the many techniques to identify gene mutations, whether 

5 now known or later developed, can be used, such as direct sequencing of the gene from 

6 affected individuals, hybridization with specific probes, which includes the technique known 

7 as allele-specific oligonucleotide hybridization, either without amplification or after 

8 amplification of the region being detected, such as by PCR. Other analysis techniques 

9 include single-strand conformation polymorphism (SSCP), restriction fragment length 

10 polymorphism (RFLP), enzymatic mismatch cleavage techniques and transcription/translation 

11 analysis. All of these techniques are described in a number of patents and other publications; 

12 see, for example, "Laboratory Protocols for Mutation Detection" (1996) Oxford University 

13 Press, Editor: Ulf Landegren. 

14 Depending on the patient being tested, different identification techniques can be 

15 selected to achieve particularly advantageous results. For example, for a group of patients 

16 known to be associated with particular mutations of the gene, oligonucleotide ligation assays, 

17 "mini-sequencing" or allele-specific oligonucleotide (ASO) hybridization can be used. For 

18 screening of individuals who are not known to be associated with a particular mutation, 

19 single-strand conformation polymorphism, total sequencing of genomic and/or cDNA and 

20 comparison with standard sequences are preferred. 

21 In many identification techniques, some amplification of the host genomic DNA (or of 

22 messenger RNA) will take place to provide for greater sensitivity of analysis. In such cases 

23 it is not necessary to amplify the entire gene, merely the part of the gene or the specific 

24 location within the gene that is being detected. Thus, the method of the invention generally 

25 comprises amplification (such as via PCR) of at least a segment of the gene, with the 

26 segment being selected for the particular analysis being conducted by the diagnostician. 

27 The patient on whom diagnosis is being carried out can be an adult, as is usually the 

28 case for genetic counseling, or a newborn, or prenatal diagnosis can be carried out on a 

29 fetus. Blood samples are usually used for genetic analysis of adults or newborns (e.g., 






1 screening of dried blood on filter paper), while samples for prenatal diagnosis are usually 

2 obtained by amniocentesis or chorionic villus biopsy. 

3 Prior to the present invention, affected individuals were prescribed one drug after 

4 another until one was found to be effective. As BP-I was diagnosed using clinical criteria, 

5 no correlation between using a particular drug and its efficacy in a given case was observed. 

6 As a result of the present invention, BP-I subtypes can be diagnosed at the molecular level 

7 and effective treatment predicted. 

8 For example, lithium salts, carbamazepine and valproic acid have all been prescribed 

9 for BP-I affected individuals with serendipitous results. An individual can now be diagnosed 

10 with bipolar mood disorder by analyzing genetic material from that individual for the 

11 presence or absence of one or more nucleic acid mutations as described above. As a result 

12 of this diagnosis at the molecular level, an effective treatment can be determined by 

13 collecting data to obtain a statistically significant correlation of a particular treatment with 

14 the different subtypes of BP-I. Thus, the practitioner is able to select a specific drug for the 

15 treatment of a particular sub-type of BP-I and does not merely rely on trial and error. 

16 Alternatively, the full-length normal genes for BP-I from humans, as well as shorter 

17 genes that produce functional proteins, can be used to correct BP-I in a human patient by 

18 supplying to the human an effective amount of a gene product of the human gene, either by 

19 gene therapy or by in vitro production of the protein followed by administration of the 

20 protein. It should be recognized that the various techniques for administering genetic 

21 materials or gene products are well known and are not themselves part of the invention. The 

22 invention merely involves supplying the genetic materials or proteins identified as a result of 

23 the present invention in place of the genetic materials or proteins previously administered. 

24 For example, techniques for transforming cells to produce gene products are described in 

25 U.S. Patent No. 5,283,185 entitled "Method for Delivering Nucleic Acid into Cells," as well 

26 as in numerous scientific articles, such as Felgner et al., "Lipofection: A Highly Efficient, 

27 Lipid-Mediated DNA-Transfection Procedure," Proc. Natl. Acad. Sci. U.S.A., 84, 7413-7417 

28 (1987); techniques for in vivo protein production are described in, for example, 

29 Mueller et al., "Laboratory Methods - Efficient Transfection and Expression of Heterologous 

30 Genes in PC12 Cells," DNA and Cell Biol., 9(3), 221-229 (1990). 
1 Administration of proteins and other molecules to overcome a deficiency disease is 

2 so well known (e.g., administration of insulin to correct for high blood sugar in diabetes) that 

3 further discussion of this technique is not necessary. Some modification of existing 

4 techniques may be required for particular applications, but those modifications are within the 

5 skill level of the ordinary practitioner using existing knowledge and the guidance provided in 

6 this specification. 

7 The invention now being generally described, the following examples are provided for 

8 purposes of illustration only and are not to be considered to limit the invention. 
9 

10 

11 EXAMPLES 

12 Pedigrees 

13 Two independently ascertained Costa Rican pedigrees (CR001 and CR004) were 

14 chosen because they contained a high density of individuals with BP-I and because their 

15 ancestry could be traced to the founding population of the Central Valley of Costa Rica. The 

16 current population of the Central Valley (consisting of about two million people) is 

17 predominantly descended from a small number of Spanish and Amerindian founders in the 

18 16th and 17th centuries (Escamilla, M.A., et al., (1996) Neuropsychiat. Genet. 67, 244- 

19 253). Studies of several inherited diseases have confirmed the genetic isolation of this 

20 population (Leon, P., et al. (1992) Proc. Natl. Acad. Sci. USA. 89, 5181-5184; 

21 Uhrhammer, N., et al. (1995) Am. J. Hum. Genet. 57, 103-111). An extensive description 

22 of pedigrees CR001 and CR004 has been published (Freimer, N.B., et al. (1996) 

23 Neuropsychiat. Genet. 67, 254-263). In the course of the study, two links between these 

24 pedigrees were discovered. However, the families were analyzed separately because these 

25 links were discovered after the simulation analyses were completed and after the genome 

26 screening study had been initiated. 

27 All available adult members of these families were interviewed in Spanish using the 

28 Schedule for Affective Disorders and Schizophrenia Lifetime version (SADS-L) (Endicott, J. 

29 et al. (1978) Arch. Gen. Psych. 35, 837-844). Individuals who received a psychiatric 

30 diagnosis were interviewed again in Spanish by a research psychiatrist using the Diagnostic 
1 Interview for Genetic Studies (DIGS) (Nurnberger, J.I. et al. (1994) Arch. Gen. Psychiat. 

2 51, 849-859). This recently developed instrument is similar to, but more detailed than 

3 SADS-L. The interviews and medical records were then reviewed by two blinded best 

4 estimators who reached a consensus diagnosis. The diagnostic procedures are described in 

5 detail in Freimer, N.B., et al. (1996) Neuropsychiat. Genet. 67, 254-263 (incorporated by 

6 reference herein). 
7 

8 Unrelated CRCV BP-I Patient Study 

9 BP localizations obtained through the CRCV pedigree studies were confirmed by 

10 genotyping an independently collected sample of 48 unrelated BP-I patients from the CRCV. 

11 In this fine mapping LD analysis, 48 unrelated BP-I patients from the CRCV were identified 

12 and genotyped using microsatellite markers spaced at narrow intervals across chromosome 

13 18. As these patients are descended from the same ancestral population as the patients in the 

14 pedigrees previously studied (CR001 and CR004), many of them should share disease 

15 susceptibility alleles inherited identically by descent (IBD) from one or a few common 

16 ancestors, and linkage disequilibrium (LD) should be present at marker loci surrounding the 

17 disease genes. 

18 The sample of 48 BP-I patients included 25 women and 23 men who were recruited 

19 from psychiatric hospitals and clinics in the CRCV. These patients were ascertained only on 

20 the basis of diagnosis and CV ancestry, and were not selected on the basis of history of BP 

21 illness in family members. A structured interview of each patient was conducted by a 

22 psychiatrist, and medical and hospital records were collected. Ascertainment and diagnostic 

23 procedures were as described above. However, in order to lessen further the probability of 

24 phenocopies among this unrelated sample, for which we lacked pedigree information, the 

25 affected phenotype was defined even more narrowly than in the pedigree study. Individuals 

26 considered affected in this study had to have suffered at least two disabling episodes of mania 

27 (requiring hospitalization) and a first onset of the illness before age 45. 

28 Genealogical research on each of the 48 BP-I patients confirmed that on average, 70% 

29 of their great-grandparents were born in the CRCV. Individuals whose great-grandparents 

30 were born in the CRCV were considered likely to be descended from the original Spanish 
1 and Amerindian founders of the CRCV. Genealogical research showed that 2 patients are 

2 first cousins and the remaining 46 have no relationship within the past 4 generations. 

3 - 

4 Genotyping Pedigree Studies 

5 Linkage simulations were used to select the most informative individuals from 

6 pedigrees CR001 and CR004 for genotyping studies (Freimer, N.B., et al. (1996) 

7 Neuropsychiat. Genet. 67, 254-263). Under a 90% dominant model, simulation analyses 

8 with these individuals suggested that evidence of linkage would likely be detected (e.g. a 

9 probability of 92% of obtaining lod > 1.0 in the combined data set) using markers with an 

10 average heterozygosity of 0.75 spaced at 10 cM intervals (as discussed in Freimer, N.B., et 

11 al. (1996) Neuropsychiat. Genet. 67, 254-263). For the Stage I screen, the most 

12 polymorphic markers (307 in total) were chosen, placed at approximately 10 cM intervals on 

13 the 1992 Genethon map (Houwen, R., et al. (1992) Nature 359, 794-801). These markers 

14 were then supplemented by a small number of markers from the Cooperative Human Linkage 

15 Center (CHLC) public database. For the Stage II screen, 166 markers were added from 

16 newer Genethon and CHLC maps as they became available (Murray, J.C. et al. (1994) 

17 Science 265, 2049-2054; Gyapay, G., et al. (1994) Nature Genet. 7, 246-339) and from the 

18 public database of the Utah Center for Genome Research, all of which are publicly available. 

19 DNA samples (from individuals in the CEPH families) that were used for size standards for 

20 Genethon and CHLC markers were included in the experiments to permit comparison of 

21 allele sizes between members of the CRCV population and individuals in the CEPH database. 

22 Genotyping procedures were as described previously (DiRienzo, A. et al. (1994) Proc. Natl. 

23 Acad. Sci. USA 91, 3166-3170 (incorporated by reference herein)). Briefly, one of the two 

24 PCR primers was labeled radioactively using a polynucleotide kinase and PCR products were 

25 run on polyacrylamide gels. Autoradiographs were scored independently by two raters. 

26 Data for each marker were entered into the computer database twice and the resultant files 

27 were compared for discrepancies. 
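
The marker selection above relied on an average heterozygosity of 0.75. As a reminder of what that figure measures, the following sketch computes expected heterozygosity from allele frequencies using the standard formula H = 1 - sum(p_i^2). The function name and the example frequencies are hypothetical and are not taken from the markers used in the study.

```python
def expected_heterozygosity(allele_frequencies):
    """Expected heterozygosity H = 1 - sum(p_i ** 2) under random mating."""
    total = sum(allele_frequencies)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_frequencies)


# Hypothetical marker with four equally frequent alleles.
four_equal = expected_heterozygosity([0.25, 0.25, 0.25, 0.25])
print(round(four_equal, 3))
```

Under this formula a marker needs at least four alleles of roughly equal frequency to reach 0.75, which is why highly polymorphic microsatellites are favored for such screens.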
28 
1 Genotyping of Unrelated BP-I CRCV Patients 

2 Twenty-seven markers were used to genotype all 48 individuals (as well as 53 

3 individuals used to establish genetic phase) at approximately 5 cM intervals along the entire 

4 chromosome 18. It was hypothesized that such a screen would permit the evaluation of 

5 evidence in the 18pter region and the investigation of other regions on chromosome 18 in 

6 which linkage to BP has been suggested by other groups in other populations. For each 

7 individual, two-marker haplotypes in each of the 26 inter-marker intervals were investigated. 

8 For 38 of the 48 BP-I patients, genotypes of parents or children were available to assist in 

9 phase determination. Because of phase ambiguities in the remaining 10 individuals, minimal 

10 and maximal two-marker haplotype sharing was evaluated as follows: (1) Minimal: the 

11 number of individuals (and chromosomes) who definitely shared a chromosomal segment 

12 defined by a particular pair of alleles (phase known chromosomes) and (2) Maximal: the 

13 number of individuals (and chromosomes) who could possibly share a chromosomal segment 

14 defined by a particular pair of alleles (includes phase unknown chromosomes). The threshold 

15 used to identify areas of high IBD sharing of chromosomes in this initial screen was 

16 designated as maximal sharing of a two-marker haplotype by 50% or more of the 48 

17 individuals (or 25% or more of the 96 chromosomes). 
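
The minimal and maximal sharing counts defined above can be sketched as follows. This is an illustrative simplification that counts individuals only (not chromosomes); the data layout, function names and example genotypes are assumptions introduced here and do not come from the study data.

```python
def haplotype_sharing(patients, haplotype):
    """Count minimal and maximal sharing of one two-marker haplotype.

    Each patient is a dict with two sets of (allele_1, allele_2) tuples:
      "phased":   haplotypes the patient definitely carries (phase known)
      "possible": haplotypes consistent with the genotypes when phase is
                  unknown
    """
    minimal = sum(1 for p in patients if haplotype in p["phased"])
    maximal = sum(
        1
        for p in patients
        if haplotype in p["phased"] or haplotype in p["possible"]
    )
    return minimal, maximal


def passes_screen(maximal, n_patients, fraction=0.5):
    """Apply the screening threshold: maximal sharing by >= 50% of patients."""
    return maximal >= fraction * n_patients


# Hypothetical example with four patients.
patients = [
    {"phased": {(154, 271), (150, 273)}, "possible": set()},
    {"phased": {(156, 271), (150, 269)}, "possible": set()},
    {"phased": set(), "possible": {(154, 271), (156, 273)}},
    {"phased": {(150, 273)}, "possible": set()},
]
minimal, maximal = haplotype_sharing(patients, (154, 271))
print(minimal, maximal, passes_screen(maximal, len(patients)))
```

With 48 patients, the threshold corresponds to a maximal count of at least 24 individuals.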

18 Arbitrary thresholds were designated to identify possible areas of high IBD sharing 

19 among the 48 patients. Eight of the 26 regions passed this screen. Within each of these eight 

20 regions, one to three additional markers were typed to permit detection of LD, if present, 

21 over regions of one to two cM. 

22 A total of 42 chromosome 18 markers were used to genotype the study sample: 

23 D18S1140, D18S59, D18S476, D18S481, D18S391, D18S452, D18S843, D18S464, 

24 D18S1153, D18S378, D18S53, D18S453, D18S40, D18S66, D18S56, D18S57, D18S467, 

25 D18S460, D18S450, D18S474, D18S69, D18S64, D18S1134, D18S1147, D18S60, D18S68, 

26 D18S55, D18S477, D18S61, D18S488, D18S485, D18S541, D18S870, D18S469, D18S874, 

27 D18S380, D18S1121, D18S1009, D18S844, D18S554, D18S461, D18S70 (from pter to 

28 qter). Of these 42 markers, four are located within the 5 cM 18pter region extending from 

29 the telomere of 18p to marker D18S481 (inclusive), which is approximately 5 cM from the 
1 telomere of 18p. This region is referred to as the 5 cM 18pter region. The four markers 

2 tested in the 5 cM 18pter region are: D18S59, D18S1140, D18S476 and D18S481. 

3 For each marker the likelihood that a particular allele (or alleles) is over-represented 

4 on disease chromosomes, as compared to non-disease chromosomes, was evaluated. The 

5 results of this likelihood test provide a conservative but powerful measure of LD between 

6 two loci. 

7 - 

8 Pedigree Statistical Analyses 

9 Two-point linkage analyses were performed for all markers. Marker allele 

10 frequencies were estimated from the combined data set with correction for dependency due to 

11 family relationships (Boehnke, M. (1991) Am. J. Hum. Genet. 48, 22-25). The linkage 

12 analyses for Stages I and II included the 65 individuals who were genotyped as well as an 

13 additional 65 individuals who had been diagnostically evaluated but not genotyped. Only 

14 individuals with BP-I were considered affected with the exception of two persons, one in 

15 each family, who carry diagnoses of schizoaffective disorder manic type (SAD-M). The 

16 SAD-M individuals were included as affected because BP-I and SAD-M are often difficult to 

17 distinguish from each other based on their clinical presentation and course of illness 

18 (Goodwin, F.K. et al. (1990) in Manic Depressive Illness (Oxford University Press, New 

19 York), pp. 373-401; Freimer, N.B. et al. (1993) in The Molecular and Genetic Basis of 

20 Neurological Disease, pp. 951-965; Freimer, N.B. et al. (1996) Neuropsychiat. Genet. 67, 

21 254-263; and Freimer, N.B. et al. (1996) Nature Genetics 12:436-441, all incorporated by 

22 reference herein). In all, 20 individuals were designated as affected within CR004 

23 (Copeman, J.B., et al. (1995) Nature Genet. 9, 80-85 available for genotyping) and 

24 10 individuals from CR001 (Kelsoe, J.R. et al. (1989) Nature 342, 238-243 available for 

25 genotyping). The phenotype for all other individuals was designated as unknown except for 

26 17 individuals who were designated as unaffected because they had been thoroughly clinically 

27 evaluated, showed no evidence of any psychiatric disorder, and were well beyond the age of 

28 risk (50) for BP-I (linkage simulation studies indicated that these unaffected individuals 

29 contributed little information to the linkage analysis). 
1 Linkage analyses were performed using a nearly dominant model (assuming 

2 penetrance of 0.81 for heterozygous individuals and of 0.9 for homozygotes with the disease 

3 mutation). This model was chosen from five different single-locus models (ranging from 

4 recessive to nearly dominant) due to its consistency with the segregation patterns of BP in the 

5 two pedigrees and because it had demonstrated the greatest power to detect linkage in 

6 simulation studies (Freimer, N.B., et al. (1996) Neuropsychiat. Genet. 67, 254-263). Based 

7 on Costa Rican epidemiological surveys (Escamilla, M.A., et al. (1996) Neuropsychiat. 

8 Genet. 67, 244-253), the population prevalence of BP-I was assumed to be 0.015 (and thus 

9 the frequency of the disease allele was assumed to be 0.003) (based on epidemiological 

10 surveys in Costa Rica, Adis, G. (1992) "Disordenes mentales en Costa Rica: Observaciones 

11 Epidemiologicas." (San Jose, Costa Rica: Editorial Nacional de Salud y Seguridad Social)). 

12 The frequency of BP-I in individuals without the disease allele was conservatively set at 0.01 

13 which effectively specified a population phenocopy rate of 0.67 (i.e., an affected individual 

14 in the general population has a 2/3 probability of being a phenocopy). For multiply affected 

15 families, the probability that a gene segregates is highly increased, which implies that 

16 affected individuals in our study pedigree have a lower probability to be phenocopies than 

17 affected individuals in the general population, particularly those with several affected close 

18 relatives (the exact probabilities are dependent on the degree of relationship between patients 

19 and the number of intervening unaffected individuals). These parameters were chosen to 

20 ensure that most of the linkage information derives from affected individuals. The rationale 

21 for selecting these parameters and results of analyses that demonstrate the conservatism of 

22 this model are described by Freimer, N.B., et al. (1996) Neuropsychiat. Genet. 67, 254-263. 

23 The LINKAGE package (Lathrop et al., (1984) Proc. Natl. Acad. Sci. USA 81, 3443-3446) 

24 was used for lod score analysis and to obtain maximum likelihood estimates of the marker 

25 allele frequencies, taking into account the existing family relationships (see Boehnke, Am. J. 

26 Hum. Genet. 48, 22-25 (1991)). 
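
The model parameters stated above can be checked with simple arithmetic. The following sketch assumes Hardy-Weinberg genotype proportions (an assumption made here for illustration; the text does not state it) and uses hypothetical function and variable names. It reproduces, to rounding, the stated prevalence of about 0.015 and the population phenocopy rate of about 0.67.

```python
def population_model(q, pen_het, pen_hom, phenocopy_risk):
    """Return (prevalence, phenocopy_fraction) for a single-locus model.

    q              : disease allele frequency
    pen_het        : penetrance in heterozygous carriers
    pen_hom        : penetrance in homozygous carriers
    phenocopy_risk : risk of disease in individuals without the allele
    Genotype proportions are assumed to follow Hardy-Weinberg.
    """
    p = 1.0 - q
    non_carriers = p * p
    heterozygotes = 2.0 * p * q
    homozygotes = q * q
    affected_genetic = heterozygotes * pen_het + homozygotes * pen_hom
    affected_phenocopy = non_carriers * phenocopy_risk
    prevalence = affected_genetic + affected_phenocopy
    return prevalence, affected_phenocopy / prevalence


prevalence, phenocopy_fraction = population_model(0.003, 0.81, 0.9, 0.01)
print(round(prevalence, 4), round(phenocopy_fraction, 2))
```

Because non-carriers make up almost the whole population, even a 0.01 risk among them accounts for roughly two thirds of all affected individuals, which is the 2/3 phenocopy probability quoted above.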
27 

28 Unrelated BP-I CRCV Patient Statistical Analyses 

29 A likelihood test of disequilibrium (J. Terwilliger, Am. J. Hum. Genet. 56, 777 

30 (1995)) was used to estimate a single parameter, lambda, that quantifies the over-representation 
1 of marker alleles on disease chromosomes as compared to non-disease 

2 chromosomes. We chose this method of analysis over another commonly used 

3 disequilibrium analysis method, the transmission disequilibrium test (TDT, R. Spielman et 

4 al., Am. J. Hum. Genet. 52, 506 (1993)) because data from all 48 BP-I patients could be 

5 used in the likelihood approach. Effective use of the TDT requires phase-known, 

6 heterozygous parental chromosomes. We do not have parental genotypes for 20 of the 48 

7 BP-I patients. Simulations indicated that with our data, the likelihood test of disequilibrium 

8 would be more powerful than the TDT. Lambda has been shown to be a superior measure 

9 for LD fine mapping, compared to other frequently used measures, because it is directly 

10 related to the recombination fraction between the disease and the marker loci. Non-disease 

11 chromosomes were chosen from the phase-known chromosomes of parents, spouses and 

12 children of affected individuals, if available. Designation of chromosomes of family 

13 members as non-disease in a disorder such as BP-I, which is not fully penetrant, necessitates 

14 specifying a model of disease transmission. The same model of transmission was employed 

15 in this LD likelihood test as was used in the initial genome screen of the pedigrees CR001 

16 and CR004 described herein. One parameter was specified differently from the genome 

17 screen: the phenocopy rate was set to zero in the LD likelihood analysis. A phenocopy rate 

18 was not specified in the transmission model because the effect of phenocopies will be 

19 "absorbed" by the lambda parameter, in that presence of phenocopies in our sample will 

20 serve to erode the association between marker alleles and disease, and hence reduce the 

21 estimate of lambda. 
22 

23 Coverage 

24 To assess coverage for a marker, the number of informative meioses at the estimated 

25 recombination fraction was calculated using the estimate of the variance (the inverse of the 

26 information matrix) (Petrukhin, K.E. et al. (1993) Genomics 15, 76-85). Alternatively, 

27 when the estimated frequency of recombination was close to 0 or 1, Edwards' equation was 

28 applied to calculate the equivalent number of observations (Edwards, J.H. (1971) Ann. Hum. 

29 Genet. 34, 229-250). These meioses represent the amount of linkage information provided 

30 by the marker, given the pedigree structure and the genetic model applied. Linkage to the 
1 marker in question was then assumed and the lod score that would be observed as a disease 

2 gene is hypothetically moved in increments away from that marker was calculated. All 

3 regions around a marker that would have generated a lod score that exceeded our thresholds 

4 for possible linkage (0.8 in CR001, 1.2 in CR004, and 1.6 in the combined data) were 

5 considered covered. These lod score thresholds were derived from simulation analyses 

6 showing the expected distribution of lod scores under linkage and non-linkage (Freimer, 

7 N.B., et al. (1996) Neuropsychiat. Genet. 67, 254-263), and approximately represent a result 

8 that is 250 times more likely to occur in linked simulations than in unlinked simulations. 

9 Coverage maps were constructed (FIG. 1) by superimposing the regions covered by each 

10 marker on the genetic map of each chromosome. At the end of the Stage II screen, a total of 

11 473 microsatellite markers had been typed with genome coverage (in the combined data set) 

12 of over 94%. Possible coverage gaps are indicated by unshaded areas and are mainly 

13 concentrated near telomeres. Because the coverage calculations make use of marker 

14 informativeness within the pedigrees, the coverage approach thus permits detection of 

15 instances where markers with expected high heterozygosities are uninformative in our data 

16 set. 
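
The coverage calculation described above can be approximated by the following sketch. It is not the computation used in the study: it assumes fully informative, phase-known meioses, a binomial variance for the recombination estimate, and the expected (rather than simulated) lod score. All function names and the example numbers are hypothetical.

```python
import math


def equivalent_meioses(theta_hat, variance):
    """Binomial approximation: Var(theta_hat) = theta (1 - theta) / n."""
    if not 0.0 < theta_hat < 1.0 or variance <= 0.0:
        raise ValueError("need 0 < theta_hat < 1 and variance > 0")
    return theta_hat * (1.0 - theta_hat) / variance


def expected_lod(n_meioses, theta):
    """Expected lod for n phase-known meioses at true recombination theta."""
    if theta <= 0.0:
        return n_meioses * math.log10(2.0)
    if theta >= 0.5:
        return 0.0
    return n_meioses * (
        math.log10(2.0)
        + theta * math.log10(theta)
        + (1.0 - theta) * math.log10(1.0 - theta)
    )


def covered_distance(n_meioses, threshold, steps=1000):
    """Largest theta (in 1/steps increments) whose expected lod meets the
    threshold, or None if the marker does not reach it even at theta = 0."""
    covered = None
    for i in range(steps // 2):
        theta = i / steps
        if expected_lod(n_meioses, theta) < threshold:
            break
        covered = theta
    return covered


# Hypothetical marker: theta estimated at 0.1 with variance 0.009.
n = equivalent_meioses(0.1, 0.009)
print(round(n, 1), covered_distance(n, 1.6))
```

In this simplified picture, a marker equivalent to ten informative meioses covers recombination fractions up to roughly 0.1 on either side at the combined-data threshold of 1.6, while a marker equivalent to two meioses covers nothing.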
17 

18 Pedigree Linkage Analysis Results 

19 Of the 473 microsatellites analyzed with two-point linkage tests, 23 markers exceeded 

20 the empirically determined thresholds designated for the coverage calculations (in either 

21 CR001, CR004, or in the combined data set). The location of these markers, the peak lod 

22 scores obtained in each family and in the combined data set, and the maximum likelihood 

23 estimate of the recombination fraction (θ) at which these lod scores were observed are 

24 indicated in Table 1. The approximate chromosomal locations of these markers are also 

25 depicted in FIG. 1. The distribution of lod scores (for the maximum likelihood estimate of θ 

26 in the combined data set) across the genome is displayed by chromosome in FIG. 2. 

27 The threshold was exceeded for pedigree CR001 in two adjacent markers near the 18p 

28 telomere (D18S59 and D18S1105), but CR004 displayed no suggestion of linkage in this 

29 region. 
1 In the genome screen, the highest lod score observed for family CR001 alone was at 

2 D18S59 (1.32 at θ = 0.0), located near pter. All affected members of CR001 shared alleles at 

3 markers in the 18pter region. 
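
To give a rough sense of scale for a lod score of 1.32 at a recombination fraction of zero, the following sketch uses the textbook two-point lod formula for phase-known, fully informative meioses. This is a simplification made here for illustration: the analyses above used the LINKAGE package with incomplete penetrance and phenocopies, so the figures below are not the study's computation. The function name is hypothetical.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """lod = log10[ theta^r (1 - theta)^(n - r) / 0.5^n ] for 0 <= theta < 1."""
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        if recombinants > 0:
            return float("-inf")
        log_likelihood = 0.0
    else:
        log_likelihood = (
            recombinants * math.log10(theta)
            + non_recombinants * math.log10(1.0 - theta)
        )
    return log_likelihood - meioses * math.log10(0.5)


# Each non-recombinant meiosis contributes log10(2) at theta = 0.
lod_per_meiosis = two_point_lod(0, 1, 0.0)
equivalent = 1.32 / lod_per_meiosis
print(round(lod_per_meiosis, 3), round(equivalent, 1))
```

Under these simplifying assumptions, a lod score of 1.32 at zero recombination corresponds to between four and five fully informative non-recombinant meioses.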
4 

5 Unrelated BP-I CRCV Patient Study Results 

6 Out of the forty-two markers tested, eight displayed evidence of over-representation 

7 of a particular allele on disease chromosomes. Eight of the 42 markers had -2*ln(likelihood 

8 ratio) statistics > 1.0. Three other markers had -2*ln(likelihood ratio) statistics >0 and 

9 <0.62. The results are shown in Table I: 

10 Table I 

Marker       Allele Size (b.p.)    Frequency on Non-Disease    Frequency on Disease
                                   Chromosomes                 Chromosomes
D18S59       154                   0.121                       0.572
D18S476      271                   0.470                       0.771
D18S467      172                   0.384                       0.693
D18S61       177                   0.074                       0.326
D18S485      182                   0.237                       0.586
D18S870      179                   0.405                       0.657
D18S469      234                   0.128                       0.450
D18S1121     168                   0.171                       0.553

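
The frequencies in Table I can be related to the over-representation parameter lambda with a back-of-the-envelope calculation. The sketch below assumes the parameterization q = p + lambda (1 - p), where p is the allele frequency on non-disease chromosomes and q its frequency on disease chromosomes. That parameterization is an assumption stated here for illustration; the values it yields are naive point calculations, not the maximum likelihood estimates obtained in the study. The names are hypothetical.

```python
# (allele size in b.p., frequency on non-disease, frequency on disease)
TABLE_I = {
    "D18S59": (154, 0.121, 0.572),
    "D18S476": (271, 0.470, 0.771),
    "D18S467": (172, 0.384, 0.693),
    "D18S61": (177, 0.074, 0.326),
    "D18S485": (182, 0.237, 0.586),
    "D18S870": (179, 0.405, 0.657),
    "D18S469": (234, 0.128, 0.450),
    "D18S1121": (168, 0.171, 0.553),
}


def naive_lambda(p_non_disease, q_disease):
    """Solve q = p + lambda * (1 - p) for lambda."""
    return (q_disease - p_non_disease) / (1.0 - p_non_disease)


naive_lambdas = {
    marker: naive_lambda(p, q) for marker, (_, p, q) in TABLE_I.items()
}
for marker, value in naive_lambdas.items():
    print(marker, round(value, 3))
```

Under this assumed parameterization, a value of zero means no excess of the allele on disease chromosomes and a value of one means the allele is present on every disease chromosome.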

24 Evidence for association was found at markers located near the telomere of the short 

25 arm of chromosome 18. D18S59 displayed the strongest evidence for LD (-2*ln(likelihood 

26 ratio) of 8.3, p=0.002) of all the chromosome 18 markers tested. An adjacent marker, 

27 D18S476 (-2*ln(likelihood ratio) of 1.3), also provided evidence of LD. In our genome 

28 screening pedigree study we observed the single highest lod score for pedigree CR001 of any 

29 marker in the entire genome at D18S59. Furthermore, the alleles at D18S59 and D18S476 
1 that are over-represented among the BP-I patients from the population sample (154 b.p. and 

2 271 b.p. respectively) are observed in all BP-I patients from pedigree CR001. 
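
The text does not state how the p-value for D18S59 was obtained from the likelihood ratio statistic. One common convention for a parameter constrained to be non-negative, such as lambda, is to take half of the upper tail of a chi-square distribution with one degree of freedom. The sketch below applies that convention as an assumption; the function name is hypothetical.

```python
import math


def one_sided_p_value(statistic):
    """Half the chi-square (1 d.f.) upper tail for a -2*ln(LR) statistic.

    For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    if statistic <= 0.0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(statistic / 2.0))


p_d18s59 = one_sided_p_value(8.3)
print(round(p_d18s59, 4))
```

With a statistic of 8.3 this convention gives a value that rounds to the reported p = 0.002, which suggests, but does not confirm, that a one-sided test of this kind was used.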

3 The LD and pedigree findings in the 5 cM 18pter region denote a clearly delineated 

4 region that contains a BP-I susceptibility locus. This region is distinct from other regions on 

5 chromosome 18 that have been suggested as linked to mood disorder phenotypes (more 

6 broadly defined than BP-I). See FIG. 6A, 6B, 6C. In contrast to previous reports by 

7 Berrettini et al. and Stine et al., suggesting possible linkage between mood disorder and 

8 markers in the pericentromeric region of chromosome 18, our results did not show any 

9 evidence for association of BP-I with any pericentromeric markers (D18S378, D18S53, 
10 D18S453 or D18S40). 

11 

12 Identification Of New Markers From The 5 cM 18pter Region 

13 Cloned human genomic DNA covering the target region is assembled. Microsatellite 

14 sequences from these clones are identified. A sufficient area around the repeat to enable 

15 development of a PCR assay for genomic DNA is sequenced, and it is confirmed that the 

16 microsatellite sequence is polymorphic, as several uninformative microsatellites are expected 

17 in any set. Several methods have been routinely used to identify microsatellites from cloned 

18 DNA, and at this time no single one is clearly preferable (Weber, 1990; Hudson et al., 

19 1992). Most of these require screening an excessive number of small insert clones or 

20 performing extensive subcloning using clones with larger inserts. 

21 New strategies have recently been developed which permit the use of the several 

22 different microsatellites to be found within a single large insert clone without requiring 

23 extensive subcloning. A method for direct identification of microsatellites from yeast 

24 artificial chromosomes (YACs) provides several new markers from the target region. This 

25 procedure is based on a subtractive hybridization step that permits separation of the target 

26 DNA from the vector background. This step is useful because the human DNA (the YAC) 

27 constitutes only a small proportion of the total yeast genomic DNA. 

28 YAC clones (with inserts averaging about 750 Kb of human genomic DNA) that span 

29 the 5 cM 18pter region have already been identified by the CEPH/Genethon consortium 

30 (Cohen et al., 1993) and are publicly available. The markers from YACs that have been 
1 mapped to portions of the candidate region that are not well represented by currently

2 available markers are first isolated. By typing these markers in the families and the "LD" 

3 sample, as described above, it is possible to narrow the candidate region, perhaps to a size of 

4 less than one to two cM, thus permitting limitation of the segment in which more extensive 

5 mapping efforts are applied. 

6 Briefly, the microsatellite identification procedure is performed as follows: A

7 subtractive hybridization is performed using genomic DNA from a target YAC together with 

8 an equivalent amount of a control DNA. This procedure separates the YAC DNA from that 

9 of the yeast vector. Following the subtraction procedure the subtracted YAC DNA is 

10 purified, digested with restriction enzymes and cloned into a plasmid vector (Ostrander et al.,

11 1992). The cloned products of each YAC are screened using a CA(15) oligonucleotide probe

12 (i.e. an oligonucleotide having 15 CA repeats). Each positive clone (i.e. those that contain 

13 TG-repeats) is sequenced to identify primers for PCR to genotype the BP-I samples. 
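
Once a positive clone has been sequenced, the repeat itself has to be located in the sequence so that flanking primers can be chosen. The following is a minimal in silico sketch of that step only, not the hybridization screen described above. The function name `find_ca_repeats`, the default threshold of 10 repeat units, and the example sequence are assumptions made for illustration.

```python
import re

def find_ca_repeats(sequence, min_repeats=10):
    """Return (start, end, unit, copies) for every CA or TG run of at least
    min_repeats units. Coordinates are 0-based, end-exclusive."""
    pattern = re.compile(r"(?:CA){%d,}|(?:TG){%d,}" % (min_repeats, min_repeats))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        run = match.group(0)
        hits.append((match.start(), match.end(), run[:2], len(run) // 2))
    return hits

# Invented example sequence: a 15-unit CA repeat with short flanks.
example = "GATTACAGG" + "CA" * 15 + "TTGACCGT"
for start, end, unit, copies in find_ca_repeats(example):
    print(f"({unit}){copies} at {start}-{end}")
```

The sequence on either side of each reported run is where PCR primers would be placed.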

14 An alternative approach, based on using a set of degenerate sequencing primers that 

15 anneal directly to the repeat sequence, permitting direct thermal cycle sequencing (Browne & 

16 Litt, 1992), can also be used. 

17 Once the candidate region is narrowed to a size of less than about 500 to 1000 Kb, a 

18 contiguous array (contig) of clones with smaller inserts than YACs, mainly P1 clones, is

19 developed. P1 clones are phage clones specially designed to accommodate inserts of up to

20 100 Kb (Shepherd et al., 1994).
21 

22 Development Of A Physical Map Of The 5 cM 18pter Region 

23 In parallel with the genetic mapping, a physical map of the 5 cM 18pter region is

24 developed. The backbone of this effort is the assembly of contigs of large insert clones. 

25 Low resolution contigs for most of the human genome are already available using the YACs 

26 developed by CEPH (Cohen et al., 1993). Although these have been individually verified

27 and checked for overlap with other YACs, there is a high rate of chimerism in the YACs and 

28 insufficient evidence to definitively confirm the order of the YACs. In addition, because of 

29 their large size these YACs are particularly cumbersome to work with. Nevertheless, they 

30 provide a useful framework to start constructing high resolution contigs. 
1 Once a candidate region of less than about five cM is delineated, the studies to

2 develop a physical map are commenced. Because of the disadvantages of relying solely on 

3 YACs, and because positional cloning is facilitated by the availability of a higher resolution 

4 map, contigs are generated using P1 clones once the candidate region is narrowed to less

5 than one Mb, by LD mapping in the expanded population sample using the new markers 

6 identified from the YACs. 



7 Once a region of 500-1000 Kb or less is defined, physical mapping and cloning are

8 conducted using P1 clones rather than YACs, and P1 contigs over such a region are

9 constructed. The P1s are used to identify additional markers for the further positional

10 cloning steps as well as the screening for rearrangements. 

11 The starting point of contig construction is the microsatellite sequences and non-



12 polymorphic STSs that derive from the few YACs that surround the genetically determined 

13 candidate region. These STSs are used to screen the P1 library. The ends of the P1s are

14 cloned using inverse PCR and used to order the P1s relative to each other. Amplification in

15 a new P1 will indicate that it overlaps with the previous one. Fluorescent in situ

16 hybridization (FISH) permits ordering of the majority of the P1s (Pinkel, 1988; Lichter,

17 1991). The original set of P1s serves as building blocks of the complete contig; each end

18 clone is used to re-screen the library and in this way P1s are added to the map.



19 From each P1 additional microsatellites are identified as previously described. This

20 allows further reduction of the candidate region. When the region is narrowed to less than 

21 one Mb in size, positional cloning efforts are initiated. 

22 Fine Mapping Of The 5 cM 18pter Region

23 In order to delineate further regions of BP-I susceptibility within the 5 cM 18pter 



24 region, additional unrelated BP-I patients from the CRCV as well as other populations can be 

25 diagnosed and genotyped both with the markers described herein as well as additional

26 markers in the 5 cM 18pter region that are known as well as those yet to be identified.

27 Additional markers are available from the Cooperative Human Linkage Center (CHLC)

28 public database, from newer Genethon and CHLC maps as they become available (Murray,

29 J.C. et al. (1994) Science 265, 2049-2054; Gyapay, G., et al. (1994) Nature Genet. 7, 246-

30 339) and from the public database of the Utah Center for Genome Research (all of which are 
1 incorporated by reference herein). The web addresses for Genethon and CHLC are:

2 Genethon (http://www.genethon.fr/genethon_en.html), CHLC 

3 (http://gopher.chlc.org/HomePage.html). These databases are all linked, and one of ordinary 

4 skill in the art can readily access the information available from these databases.

5 The markers shown in FIG. 6A, from number 1 to 22 or 23 can be used to genotype 

6 the CRCV pedigrees and unrelated BP-I patients described herein as well as other BP-I 

7 affected individuals and pedigrees. See FIG. 6A (portion of a chromosome 18 map available 

8 from the Whitehead Institute, web address: http://www-genome.wi.mit.edu

9 (incorporated herein by reference)). The fine mapping techniques

10 described herein in conjunction with the teachings regarding the 5 cM 18pter region can be

11 used to narrow the BP-I susceptibility region further.

12 The following markers (listed in order of occurrence from the telomere towards the 

13 centromere) were used to delineate regions of BP-I susceptibility within the 5 cM 18pter

14 region: SAVA5, ca211, ca212, D18S1140, D18S59, ca231, ta201, AT201, ca225, w3442,

15 ca213, ga201, ga203, ca219, D18S1105, ca209, ca202, D18S1146, GATA (referred to in the

16 figures as I66d05) and D18S476. The markers SAVA5, D18S1140, D18S59, ta201, at201,

17 w3442, ga201, ga203, D18S1105, D18S1146, GATA and D18S476 were used in both the

18 haplotype analysis (Figure 8) and the AHR analysis (Figure 11) to delineate the BP-I

19 susceptibility locus to the 500 kb region defined by the markers SAVA5 and ga203 and the

20 300 kb region defined by D18S1140 and W3422. The other markers were used in both

21 haplotype and the AHR analyses as confirmatory evidence for the localizations. Blood

22 samples from 105 affected individuals were tested for the presence of marker haplotypes and

23 compared to marker haplotypes detected on the non-transmitted chromosome in samples 

24 obtained from the parent(s) of the affected individuals when available (71 cases) or to 

25 markers detected in samples obtained from a control population of students attending the 

26 University of Costa Rica (52 samples). The non-transmitted chromosomes are well matched

27 as controls allowing the affected haplotype of the transmitted chromosome to be more easily 

28 discerned than through comparison with data obtained from the general population that may 

29 contain individuals who carry the affected haplotype but do not exhibit clinical symptoms of 

30 bipolar mood disorder. 
1 Figure 7 provides 18p allele frequencies for disease (aff 105) versus nontransmitted

2 (ntrans) chromosomes and samples from the control population of students (control). The 

3 name of each marker used in this study is indicated on the left. The second column of

4 numbers refers to allele length in basepairs. This data provides evidence of over- 

5 representation of a particular allele on disease chromosomes. 
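
The comparison behind such a frequency table can be sketched in a few lines. The allele lengths below are invented for illustration and are not taken from Figure 7. The function name `allele_frequencies` is likewise an assumption.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Map each observed allele (length in base pairs) to its relative frequency."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Invented allele lengths at one marker, one entry per chromosome.
affected = [154, 154, 154, 158]
nontransmitted = [154, 158, 158, 160]

aff = allele_frequencies(affected)
ntr = allele_frequencies(nontransmitted)
for allele in sorted(set(aff) | set(ntr)):
    print(allele, aff.get(allele, 0.0), ntr.get(allele, 0.0))
```

An allele whose frequency is clearly higher in the affected column than in the non-transmitted column is a candidate for over-representation on disease chromosomes.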

6 Figure 8 summarizes the results obtained with affected individuals. The column 

7 labelled 18p refers to the patient identifier, and each patient identifier is repeated to indicate 

8 results with both copies of chromosome 18. The labels "PANR" and "MANR" refer to the 

9 paternal and maternal identifier, respectively, associated with the particular patient, other 

10 than 0, 1 and 2, which indicate that parental samples were not available. The allele length 

11 (base pairs) is indicated under each marker for a particular patient; the length of the

12 horizontal bar in the figure reflects whether haplotypes are IBD or IBS, with IBD haplotypes 

13 with common ancestors having longer bars than randomly inherited IBS haplotypes. To the 

14 right of each marker, a "1" indicates that the phase is known, i.e., that it is known whether a 

15 particular allele is inherited from the paternal or maternal chromosome, and a "0" indicates

16 that the phase is not known for sure. The determination of phase allows the practitioner to 

17 conclude that marker alleles are linked in a haplotype on the same disease causing 

18 chromosome. 

19 Figure 9 provides similar data for non-transmitted chromosomes obtained from 

20 parental samples. Some individuals exhibited the affected haplotype indicating that the parent 

21 was homozygous; however, these regions of identity were typically much shorter than those 

22 regions observed in affected individuals, indicating that they were IBS. 

23 Figure 10 similarly provides data for an unscreened population of students 

24 from the University of Costa Rica and their parents (52 samples in total). The data 

25 demonstrate that there is a lower incidence of the affected haplotype in the general population 

26 as compared with Figure 8 and that the affected haplotype is typically shorter similar to the 

27 results obtained with non-transmitted chromosomes. However, the results for the general 

28 population are less distinctive than those observed for non-transmitted chromosomes in allowing

29 one to map the affected haplotype. 
1 Comparison of the affected haplotype with non-transmitted chromosome markers

2 indicates that the region of maximal sharing between affected individuals occurs between

3 D18S1140 and w3442 on chromosome 18. This region encompasses approximately 300 kb.

4 The data was analyzed further using Ancestral Haplotype Reconstruction (AHR), a 

5 likelihood method for measuring LD. Data from affected individuals are examined in 2- 

6 marker segments. Within each segment, the multinomial likelihood of each of the possible 

7 ancestral haplotypes giving rise to the observed sample of disease haplotypes is calculated. 

8 This likelihood is calculated assuming some fraction, $\alpha$, of disease chromosomes are

9 associated with this 2-marker segment, and $(1-\alpha)$ are not linked to this segment. These

10 haplotype likelihoods are weighted by the probability of observing that haplotype in the

11 population, and summed to create an overall likelihood for the 2-marker segment. This

12 segment likelihood is compared to the null likelihood, which assumes the disease and

13 markers are unlinked (and therefore $\alpha$ = 0), and a LOD score is generated. The LOD score

14 is maximized over the parameter $\alpha$. Details of these calculations are presented in Appendix

15 A. The results of this analysis are shown in Figure 11. The percentages given above the

16 diagonal line demarcated by the filled boxes indicate the percentage of disease chromosomes 

17 hypothesized to be true chromosomes from a common founder. For example, 17% of 

18 chromosomes obtained from affected individuals have the 18S59 to W3442 region; i.e., as 

19 each individual has two chromosome copies, 34% of individuals have this region. The 

20 number above each percentage indicates the LOD score. The numbers given below the 

21 diagonal line demarcated by the filled boxes indicate the alleles inherited from a common 

22 founder, with the number prior to the dash indicating the allele of the marker identified in 

23 the column heading and the number following the dash indicating the allele of the marker 

24 identified in the row heading. The marker alleles are referred to as follows: 
25 
MARKER      #    ALLELE LENGTH
SAVA5       2    229
CA211       3    195
18S1140     2    268
18S59       4    154
18S59       6    158
TA201       2    220
TA201       3    230
CA231       2    186
CA231       4    202
AT201       1    170
AT201       2    178
CA225       1    160
CA225       3    172
W3442       1    10



16 Blank boxes indicate no positive evidence for linking the indicated region to the affected 

17 chromosome. 
18 

19 Use Of P1 Clones To Identify Candidate cDNAs For Screening For Mutations

20 In The DNA Of BP-I Patients 
21 

22 The P1 clones described above are used to identify candidate cDNAs. The candidate

23 cDNAs are subsequently screened for mutations in DNA from BP-I patients. From the 

24 minimal candidate region defined by genetic mapping experiments a segment is left that is 

25 sufficiently large to contain multiple different genes. 
26 

27 Identification Of Coding Sequences 

28 Coding sequences from the surrounding DNA are identified, and these sequences are 

29 screened until a probable candidate cDNA is found. Much of the human genome will be 

30 sequenced over the next few years, in which case it may become feasible to identify coding 

31 sequences through database screening. Candidates may also be identified by scanning 
1 databases consisting of partially sequenced cDNAs (Adams et al., 1991), known as expressed

2 sequence tags, or ESTs. These resources are already largely developed, and include upwards

3 of 100,000 cDNAs, the majority expressed primarily in the brain. It is not yet clear,

4 however, that the complete set of cDNAs will be mapped to specific chromosomal locations 

5 in the near future, and that their data will soon be made publicly available. The database can 

6 be used to identify all cDNAs that map to the minimal candidate region for BP-I. These 

7 cDNAs are then used as probes to hybridize to the P1 contig, and new microsatellites are

8 isolated, which are used to genotype the "LD" sample. Maximal linkage disequilibrium in

9 the vicinity of one or two cDNAs is identified. These cDNAs are the first ones used to

10 screen patient DNA for mutations. Database screening has already been used to identify a

11 gene responsible for familial colon cancer (Papadopolous et al., 1993).

12 Coding sequences are also identified by exon amplification (Duyk et al., 1990;

13 Buckler et al., 1991). Exon amplification targets exons in genomic DNA by identifying the 

14 consensus splice sequences that flank exon-intron boundaries. Briefly, exons are trapped in 

15 the process of cloning genomic DNA (e.g. from P1s) into an expression vector (Zhang et al.,

16 1994). These clones are transfected into COS cells, and RT-PCR is performed on total or

17 cytoplasmic RNA isolated from the COS cells using primers that are complementary to the

18 splicing vector. Exon amplification is tedious but routine, for example using the system developed

19 by Buckler et al. (1991). This method is probably preferable to another widely used

20 approach, direct selection, which involves screening cDNAs using large insert clone contigs, 

21 with several steps to maximize the efficiency of hybridization and recovery of the appropriate 

22 hybrid (Lovett et al., 1991). Although direct selection is more efficient than exon

23 amplification (Del Mastro et al., 1994), it may not be practical as it depends on the candidate

24 cDNA being expressed in the tissue from which the cDNA library was made; there is no

25 prior information to indicate the tissue or developmental stage in which BP-I genes would be

26 expressed. 

27 Once cDNAs are identified the most plausible candidates are screened by direct 

28 sequencing, SSCP or using chemical cleavage assays (Cotton et al., 1988).

29 The data are also evaluated for clues to the possible identity or mode of action of BP- 

30 I mutations. For example, it is known that trinucleotide repeat expansion is associated with 
1 the phenomenon of anticipation, or the tendency for a phenotype to become more severe and

2 display an earlier age of onset in the lower generations of a pedigree (Ballabio, 1993).

3 Several investigators have suggested that segregation patterns of BP-I are consistent with

4 anticipation (McInnis et al., 1993; Nylander et al., 1994). The apparent transmission of BP-

5 I, in association with the conserved 18q23 haplotype is consistent with anticipation.

6 Therefore, once the candidate region is narrowed to its minimal extent, the P1 clones are

7 screened using trinucleotide repeat oligonucleotides (Hummerich et al., 1994). A PCR assay

8 is developed and patient DNAs are screened for expanded alleles.

9 Genetic and physical data help to map the bipolar mood disorder gene to the 5 cM

10 18pter region of chromosome 18. New markers from this region are tested in order to locate

11 the bipolar mood disorder gene in a region small enough to provide higher quality genetic

12 tests for bipolar mood disorder, and to specifically find the mutated gene. Narrowing down

13 the region in which the gene is located will lead to sequencing of the bipolar mood disorder

14 gene as well as cloning thereof. Further genetic analysis employing, for example, new

15 polymorphisms flanking D18S59 and D18S476 as well as the use of cosmids, yeast artificial

16 chromosome (YAC) clones, or mixtures thereof, are employed in the narrowing down

17 process. The next step in narrowing down the candidate region includes cloning of the

18 chromosomal region 18pter including proximal and distal markers in a contig formed by

19 overlapping cosmids and YACs. Subsequent subcloning in cosmids, plasmids or phages will

20 generate additional probes for more detailed mapping.

21 The next step of cloning the gene involves exon trapping, screening of cDNA

22 libraries, Northern blots or RT-PCR (reverse transcriptase PCR) of samples from affected and

23 unaffected individuals, direct sequencing of exons or testing exons by SSCP (single strand

24 conformation polymorphism), RNase protection or chemical cleavage.

25 Flanking markers on both sides of the bipolar mood disorder gene combined with

26 D18S59 and D18S476 or a number of well-positioned markers that cover the chromosomal

27 region (5 cM 18pter) carrying the disease gene, can give a high probability of affected or

28 non-affected chromosomes in the range of 80-90% accuracy, depending on the

29 informativeness of the markers used and their distance from the disease gene. Using current

30 markers linked to bipolar mood disorder, and assuming closer flanking markers will be
1 identified, a genetic test for families with bipolar mood disorder will be for diagnosis in

2 conjunction with clinical evaluation, screening of risk and carrier testing in healthy siblings. 

3 In the future, subsequent delineation of closely linked markers which may show strong 

4 disequilibrium with the disorder, or identification of the defective gene, could allow 

5 screening of the entire at-risk population to identify carriers, and provide improved 

6 treatments. 
7 

8 Treatment of BP-I Patients Using Genotype Data 

9 Using the fine mapping techniques described herein, BP-I susceptibility loci or genes 

10 in the 5 cM 18pter region in particular in the region #1. between SAVA5 and ga203, are 

11 identified and used to genotype patients diagnosed phenotypically with BP-I. Preferably,

12 markers in the roughly 500 kb region defined by SAVA5 and ga203, inclusive, are used. 

13 More preferably, markers in either the region defined by D18S59 and w3422, inclusive, are 

14 used. 

15 Genotyping with the markers described herein as well as additional markers permits 

16 confirmation of phenotypic BP-I diagnoses or assists with ambiguous clinical phenotypes

17 which make it difficult to distinguish between BP-I and other possible psychiatric illnesses. 

18 A patient's genotype in the 5 cM 18pter region is determined and compared with previously 

19 determined genotypes of other individuals previously diagnosed with BP-I. Once an 

20 individual is genotyped as having a BP-I susceptibility locus in the 5 cM 18pter region, the 

21 individual is treated with any of the known methods effective in treating at least certain 

22 individuals affected with BP-I, such as the administration of lithium salts, carbamazepine or 

23 valproic acid. 

24 Studies are conducted correlating effective treatments with BP-I genotypes in the 5

25 cM 18pter region to determine the most effective treatments for particular genotypes. BP-I

26 patients can then be genotyped in the 5 cM 18pter region and the statistically most effective 

27 treatment can be determined as a first course of therapy. 

28 All publications and patent applications mentioned in this specification are herein 

29 incorporated by reference to the same extent as if each individual publication or patent 

30 application was specifically and individually indicated to be incorporated by reference. 
1 The invention now being fully described, it will be apparent to one of ordinary skill

2 in the art that many changes and modifications can be made thereto without departing from 

3 the spirit or scope of the appended claims. 
Appendix A

Consider the original mutation to have occurred on a chromosomal segment
between two markers A and B. At the time the mutation was introduced, there were
$n_a$ alleles at marker A and $n_b$ alleles at marker B. On the chromosome containing
the disease mutation both marker A and marker B carried allele X. The
probability that after g generations an affected individual carrying the original disease
mutation would still have allele X at markers A and B is:

$$(1-\theta_1)^g(1-\theta_2)^g + (1-\theta_1)^g\left(1-(1-\theta_2)^g\right)f(X_B) + \left(1-(1-\theta_1)^g\right)(1-\theta_2)^g f(X_A) + \left(1-(1-\theta_1)^g\right)\left(1-(1-\theta_2)^g\right)f(X_A)f(X_B) \qquad \text{eq (1)}$$

where $\theta_1$ is the recombination fraction between disease and marker A, $\theta_2$ is the
recombination fraction between disease and marker B, g is the number of
generations since founding (i.e. since the mutation was introduced into the population),
$f(X_A)$ is the population frequency of the X-allele at marker A and $f(X_B)$ is the
population frequency of the X-allele at marker B. This equation includes terms
for the possibility of recombination between the markers and the disease locus,
with the X-allele at the markers then being identical by state (IBS) rather than
IBD. The probabilities of an affected individual with the original mutation having
other haplotypes can be formulated similarly. The probability of having allele Z
at marker B (where Z is any allele at marker B besides X) would be:
$$(1-\theta_1)^g\left(1-(1-\theta_2)^g\right)f(Z_B) + \left(1-(1-\theta_1)^g\right)\left(1-(1-\theta_2)^g\right)f(X_A)f(Z_B) \qquad \text{eq (2)}$$

where $f(Z_B)$ is the frequency of allele Z at marker B in the population. The
probability of having allele Z at marker A (where Z is any allele at marker A besides
X) would be:

$$(1-\theta_2)^g\left(1-(1-\theta_1)^g\right)f(Z_A) + \left(1-(1-\theta_1)^g\right)\left(1-(1-\theta_2)^g\right)f(Z_A)f(X_B) \qquad \text{eq (3)}$$

where $f(Z_A)$ is the frequency of allele Z at marker A in the population. Finally,
the probability of having allele Z at both markers A and B would be: 

$$\left(1-(1-\theta_1)^g\right)\left(1-(1-\theta_2)^g\right)f(Z_A)f(Z_B) \qquad \text{eq (4)}$$

These probabilities assume (1) no interference in recombination and (2) the same
marker alleles are present now as were present g generations ago, in similar
frequencies. If, for example, marker A has $n_a$ alleles and marker B has $n_b$ alleles,
then these probabilities form an $(n_a)(n_b)$ by $(n_a)(n_b)$ transition matrix, with row i
containing the probabilities that founder haplotype i gave rise to each of the
$(n_a)(n_b)$ different haplotypes in g generations. The rows of this transition matrix sum
to 1.
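
The four cases in eqs (1)-(4) factor into one term per marker: an allele is either retained from the founder with probability $(1-\theta)^g$, or drawn from the population frequencies after a recombination. The sketch below builds the transition matrix on that basis. It is an illustration under stated assumptions, not the implementation used for the analysis; the function names and the example allele frequencies are invented.

```python
from itertools import product

def allele_prob(allele, founder_allele, theta, g, freqs):
    """Probability of seeing `allele` at one marker after g generations."""
    keep = (1.0 - theta) ** g  # no recombination with the disease locus
    retained = keep if allele == founder_allele else 0.0
    return retained + (1.0 - keep) * freqs[allele]

def transition_matrix(theta1, theta2, g, freqs_a, freqs_b):
    """Rows are founder haplotypes, columns are present-day haplotypes."""
    haplotypes = list(product(freqs_a, freqs_b))
    return {
        founder: {
            (a, b): allele_prob(a, founder[0], theta1, g, freqs_a)
                    * allele_prob(b, founder[1], theta2, g, freqs_b)
            for (a, b) in haplotypes
        }
        for founder in haplotypes
    }

# Invented allele frequencies; "Z" stands for a single non-X allele.
freqs_a = {"X": 0.2, "Z": 0.8}
freqs_b = {"X": 0.3, "Z": 0.7}
matrix = transition_matrix(0.01, 0.02, 10, freqs_a, freqs_b)
for founder, row in matrix.items():
    print(founder, round(sum(row.values()), 12))
```

Each printed row sum should be 1 up to rounding, provided the allele frequencies at each marker sum to 1.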
In simulations, the haplotype frequencies in the disease population were
formulated using these transition probabilities, assuming the disease arose on a
haplotype with the "1" allele at each of the two markers.

Once these transition probabilities are estimated, the likelihood of a particular 
founder chromosome giving rise to the observed sample of disease haplotypes in 
g generations is easily estimated. For example, if one assumed that the disease 
mutation arose on a chromosome with the X-allele at both markers, the likelihood 
($L_{X,X}$) that this chromosome was the founder of the present-day sampled disease
chromosomes is given by the multinomial:

$$L_{X,X} = \frac{\left(\sum_{i=1}^{K} Y_i\right)!}{\prod_{i=1}^{K} Y_i!}\,\prod_{i=1}^{K} p_{X,X,i}^{\,Y_i} \qquad \text{eq (5)}$$

where i indexes the K potential haplotypes for the two markers ($K = (n_a)(n_b)$), $p_{X,X,i}$
is the probability that the ancestral disease chromosome with the X-allele at both
markers gave rise to a haplotype of type i in g generations (taken from the
transition matrix), and $Y_i$ is the observed number of haplotype i in the sample
($\sum_i Y_i$ = the number of chromosomes in the sample to be analyzed). The
likelihood in eq (5) assumes that all affected individuals are independent. While, after
many generations of separation from a common ancestor one might consider these
individuals to be independent, they are in fact related through a complex and
unknown pedigree. The simplification of considering individuals as independent
makes the likelihood much more tractable to compute.
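
A multinomial likelihood of this kind is usually evaluated on the log scale to avoid underflow. The following is a minimal sketch, assuming the haplotype counts and the matching row of the transition matrix are supplied as parallel lists. The function name is invented, and the multinomial coefficient is included even though it cancels in any likelihood ratio computed on the same counts.

```python
from math import exp, lgamma, log

def multinomial_log_likelihood(counts, probs):
    """Log multinomial probability of `counts` given per-haplotype `probs`."""
    n = sum(counts)
    # log of n! / prod(Y_i!)
    log_l = lgamma(n + 1) - sum(lgamma(y + 1) for y in counts)
    for y, p in zip(counts, probs):
        if y > 0:
            if p <= 0.0:
                return float("-inf")  # an observed haplotype with zero probability
            log_l += y * log(p)
    return log_l

# Invented counts for K = 4 haplotypes and an invented transition-matrix row.
print(exp(multinomial_log_likelihood([3, 1, 0, 0], [0.7, 0.1, 0.1, 0.1])))
```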

The K likelihoods $L_i$ (one for each candidate founder haplotype i) are then summed,
and weighted by the probability of observing that particular haplotype in the
population, to produce an overall likelihood for the 2-marker segment:

$$L = \sum_{i=1}^{K} f_i L_i \qquad \text{eq (6)}$$

where $f_i$ is the frequency of haplotype i in the population. This overall likelihood
calculation parallels the approach taken by Terwilliger (1995, eq (2)). The
haplotype frequencies are estimated from the sample of normal chromosomes. In
the event that the haplotype resulting in the largest contribution to the overall
likelihood in eq (6) is not observed in the normal sample, the upper 95%
confidence interval for this frequency is used, and the remaining haplotype frequencies
rescaled accordingly.
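
When the per-founder likelihoods are held as logarithms, the frequency-weighted sum can be formed with a log-sum-exp step. This is a sketch under that assumption; the function name and the numbers are invented, and the rescaling for unobserved haplotypes described above is not attempted here.

```python
from math import exp, log

def segment_log_likelihood(founder_log_likelihoods, haplotype_freqs):
    """log( sum_i f_i * L_i ), given log L_i and population frequencies f_i."""
    terms = [
        log(f) + log_l
        for log_l, f in zip(founder_log_likelihoods, haplotype_freqs)
        if f > 0.0 and log_l != float("-inf")
    ]
    if not terms:
        return float("-inf")
    largest = max(terms)  # factor out the largest term for numerical stability
    return largest + log(sum(exp(t - largest) for t in terms))

# Invented values: two founder haplotypes with equal population frequency.
print(exp(segment_log_likelihood([log(0.2), log(0.4)], [0.5, 0.5])))
```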

This overall likelihood is compared to the null likelihood, which is generated in
exactly the same manner, except that it is assumed the markers were unlinked to
the disease locus ($\theta_1 = \theta_2 = 0.5$ in, for example, eqs (1-4)). The $\log_{10}$ of this
likelihood ratio is a LOD score. One might instead consider using, in the null likelihood,
transition probabilities calculated under the assumption of linkage equilibrium. Under
this null the cells of the transition matrix are computed by multiplication of allele
frequencies, assuming independence of marker loci. These two forms of the null
likelihood are equivalent in value for g of approximately 20 or greater, and for
g<20 the values are nearly equivalent.
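
The closeness of the two null forms can be checked numerically: with both recombination fractions at 0.5, the retained-founder term is $0.5^g$, which shrinks quickly as g grows. The sketch below compares one cell of the transition matrix against the product of allele frequencies and also shows the LOD conversion. The function names and allele frequencies are invented for illustration.

```python
from math import log

def haplotype_prob(a_is_founder, b_is_founder, f_a, f_b, theta1, theta2, g):
    """One transition-matrix cell, written as a product of per-marker terms."""
    keep_a = (1.0 - theta1) ** g
    keep_b = (1.0 - theta2) ** g
    p_a = (keep_a if a_is_founder else 0.0) + (1.0 - keep_a) * f_a
    p_b = (keep_b if b_is_founder else 0.0) + (1.0 - keep_b) * f_b
    return p_a * p_b

def lod_score(log_l_alt, log_l_null):
    """Base-10 log of the likelihood ratio, from natural-log likelihoods."""
    return (log_l_alt - log_l_null) / log(10.0)

f_a, f_b = 0.2, 0.3  # invented allele frequencies
for g in (5, 10, 20):
    unlinked = haplotype_prob(True, True, f_a, f_b, 0.5, 0.5, g)
    print(g, unlinked - f_a * f_b)  # gap to the linkage-equilibrium value
```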

Because $\theta_1$ and $\theta_2$ are obviously unknown, the putative disease locus is set to be in
the middle of the segment and therefore $\theta_1$ and $\theta_2$ are one-half the genetic distance
(converted to recombination fraction by the Haldane mapping function (Ott,
1991)) between the two marker loci forming the segment. In fact, one could
estimate $\theta_1$ and $\theta_2$, or their ratio, and the method could easily be modified to do so;
however, for our purposes finding a linked segment is suitable.
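
The Haldane mapping function converts a map distance d (in Morgans) to a recombination fraction as $\theta = \frac{1}{2}(1 - e^{-2d})$. The sketch below applies it to half the inter-marker distance, as the midpoint assumption requires. The function names and the 2 cM example are assumptions for illustration.

```python
from math import exp

def haldane_theta(distance_cm):
    """Recombination fraction for a map distance given in centimorgans."""
    return 0.5 * (1.0 - exp(-2.0 * distance_cm / 100.0))

def midpoint_thetas(segment_cm):
    """theta_1 and theta_2 when the disease locus sits midway between the markers."""
    theta = haldane_theta(segment_cm / 2.0)
    return theta, theta

# Invented example: two markers 2 cM apart.
print(midpoint_thetas(2.0))
```

For short segments the recombination fraction is very close to the distance in Morgans, so a 1 cM half-segment gives a value just under 0.01.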

This basic procedure has been modified to deal with heterogeneity in the sample 
of disease chromosomes. Not all chromosomes in the disease sample may be true 
disease chromosomes from a common founder. Individuals heterozygous for the 
disease mutation will add one chromosome to the disease sample that will not be a 
true disease chromosome. Additionally, affected individuals not linked to the
particular chromosomal segment being analyzed (either because they are
phenocopies or because of locus heterogeneity) will contribute two chromosomes to the
affected sample that do not harbor this disease locus. When the null hypothesis of
no linkage is not true, some fraction, $\alpha$, of the chromosomes in the disease sample
will be associated with this chromosomal segment, and $(1-\alpha)$ will not be associated.
We decided to examine $\alpha$ in steps of 0.1, from 1.0 to 0.0, and for each step in $\alpha$
produce a new transition matrix under the alternative hypothesis and calculate a
LOD score. If we call the transition matrix calculated under the alternative
hypothesis (where the disease locus is hypothesized to be in the middle of the
2-marker segment) $T_a$ and call the transition matrix calculated under the null
hypothesis (where the disease locus is unlinked to the 2-marker segment) $T_n$, then a
new transition matrix for the alternative hypothesis is calculated as:

$$T_a^{*} = \alpha T_a + (1-\alpha) T_n \qquad \text{eq (7)}$$

The transition matrix under the null uses $\alpha$ = 0. The LOD score is then maximized
over the one parameter $\alpha$.
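
The grid search over the mixing fraction can be sketched as follows. This is a simplified illustration that uses a single founder row rather than the frequency-weighted sum over all founder haplotypes, and it omits the multinomial coefficient because it cancels in the ratio. The function names, counts and matrix rows are invented.

```python
from math import log10

def mix_rows(alt_row, null_row, alpha):
    """Mixture of alternative and null transition probabilities for one row."""
    return [alpha * a + (1.0 - alpha) * n for a, n in zip(alt_row, null_row)]

def log10_likelihood(counts, probs):
    """Multinomial log10-likelihood without the constant coefficient."""
    return sum(y * log10(p) for y, p in zip(counts, probs) if y > 0)

def best_alpha(counts, alt_row, null_row, steps=10):
    """Return (alpha, LOD) maximizing the LOD over alpha = 0, 1/steps, ..., 1."""
    null_ll = log10_likelihood(counts, null_row)
    best = (0.0, 0.0)  # alpha = 0 reproduces the null, so its LOD is 0
    for k in range(steps + 1):
        alpha = k / steps
        mixed = mix_rows(alt_row, null_row, alpha)
        if any(y > 0 and p <= 0.0 for y, p in zip(counts, mixed)):
            continue  # an observed haplotype would have zero probability
        lod = log10_likelihood(counts, mixed) - null_ll
        if lod > best[1]:
            best = (alpha, lod)
    return best

# Invented data: 100 disease chromosomes over K = 4 haplotypes.
counts = [40, 20, 20, 20]
alt_row = [0.7, 0.1, 0.1, 0.1]
null_row = [0.25, 0.25, 0.25, 0.25]
print(best_alpha(counts, alt_row, null_row))
```

With these invented numbers the first haplotype is over-represented relative to the null row but less so than the alternative row predicts, so the maximum falls at an intermediate value of the mixing fraction.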
1 WHAT IS CLAIMED IS:

2 

3 1. A method of detecting the presence of a bipolar mood disorder susceptibility locus in

4 an individual comprising: 

5 analyzing a sample of DNA from said individual for the presence of a DNA 

6 polymorphism on the short arm of chromosome 18 between SAVA5 and ga203, wherein said 

7 DNA polymorphism is associated with a form of bipolar mood disorder. 
8 

9 2. The method of claim 1, wherein said DNA polymorphism is located on the short arm

10 of chromosome 18 between D18S1140 and ga203, inclusive. 
11 

12 3. The method of claim 1, wherein said DNA polymorphism is located on the short arm

13 of chromosome 18 between SAVA5 and W3422, inclusive. 
14 

15 4. The method of claim 1, wherein said DNA polymorphism is located on the short arm 

16 of chromosome 18 between D18S1140 and W3422, inclusive. 
17

18 5. The method of claim 1, wherein said DNA polymorphism is located on the short arm 

19 of chromosome 18 between D18S1140 and at201, inclusive.
20 

21 6. The method of claim 1, wherein said DNA polymorphism is located on the short arm 

22 of chromosome 18 between D18S1140 and ta201, inclusive. 
23 

24 7. The method of claim 1, wherein said DNA polymorphism is located on the short arm 

25 of chromosome 18 between D18S59 and ta201, inclusive. 
26 
1 8. The method of claim 1, wherein said analyzing further comprises:

2 a. obtaining DNA samples from family members of said individual, 

3 b. analyzing said DNA samples from family members for the presence of said DNA 

4 polymorphism, and 

5 c. correlating the presence or absence of the DNA polymorphism with a 

6 phenotypic diagnosis of bipolar mood disorder for said individual and for said family 

7 members. 
8 

9. A method for detecting the presence of a DNA polymorphism linked to a gene
associated with bipolar mood disorder in an individual comprising:
a. typing blood relatives of said individual for a DNA polymorphism located
within a 500 kb region of chromosome 18, wherein said region is located between SAVA5
and ga203, inclusive,
b. analyzing a DNA sample from said individual for the presence of said DNA
polymorphism.

10. A method of genetically diagnosing bipolar mood disorder in an individual
comprising:
a. obtaining a DNA sample from said individual,
b. analyzing said DNA sample for the presence of a DNA polymorphism
associated with bipolar mood disorder, wherein said DNA polymorphism is located within a
500 kb region of chromosome 18, wherein said region is located between SAVA5 and ga203,
inclusive.

11. A method of confirming a phenotypic diagnosis of bipolar mood disorder in an
individual comprising:
a. obtaining a DNA sample from said individual,
b. analyzing said DNA sample for the presence of a DNA polymorphism
associated with bipolar mood disorder, wherein said DNA polymorphism is located within a
500 kb region of chromosome 18, wherein said region is located between SAVA5 and ga203,
inclusive.


12. The method of claim 10, wherein said individual has Spanish or Amerindian ancestry.

13. A method of classifying subtypes of bipolar mood disorder comprising:
a. identifying one or more DNA polymorphisms located within a 500 kb region
of chromosome 18, wherein said region is located between SAVA5 and ga203, inclusive; and
b. analyzing DNA samples from individuals phenotypically diagnosed with
bipolar mood disorder for the presence or absence of one or more of said DNA
polymorphisms.

14. A method of treating an individual diagnosed with bipolar mood disorder comprising:
a. identifying one or more DNA polymorphisms located within a 500 kb region
of chromosome 18, wherein said region is located between SAVA5 and ga203, inclusive;
b. analyzing DNA samples from individuals phenotypically diagnosed with
bipolar mood disorder for the presence or absence of one or more of said DNA
polymorphisms; and
c. selecting a treatment plan that is most effective for individuals having a
particular genotype within said 500 kb region of chromosome 18.

15. An isolated polynucleotide capable of selectively hybridizing with a DNA sample
from an individual phenotypically diagnosed with severe bipolar mood disorder, wherein said
polynucleotide does not selectively hybridize with a DNA sample from an individual not
affected by severe bipolar mood disorder, wherein said isolated polynucleotide selectively
hybridizes with a complementary polynucleotide within a 500 kb region of chromosome 18,
wherein said region is located between SAVA5 and ga203, inclusive.

16. The isolated polynucleotide of claim 15, wherein said complementary polynucleotide
is within a 500 kb region of chromosome 18, between SAVA5 and ga203, inclusive.






Table 1. Lod scores for markers exceeding the arbitrary coverage thresholds.
FIGURE 3
FIGURE 5




[FIGURE 5: Chr18: Contigs Anchored on Integrated Map. Chromosome Chr18; singly-linked YAC contig WC18.0.]

NOTES

1. [...] and the radiation hybrid [...] are used to anchor YAC/STS contigs. We only show
[...] radiation hybrid mapped STSs for which positive YACs are present. [...] published in
Nature Genetics 7(2):246-339 (1994) [...]

2. The apparent size of a contig on this map does not always correlate with the number of its members.
[...] contigs [...] expanded because of contradictions between the [...] genetic map, and
adjacencies computed from YAC [...] appear to overlap represent places where missing YAC data
prevents the [...] contradictions between the order derived from the radiation
hybrid map and the order derived from the STS content map.

3. The [...] that appears on many of the radiation hybrid maps corresponds to the
centromere.

4. Markers derived from expressed sequence tags (ESTs) or other expressed sequences are colored red.
FIGURE 6A

This STS is part of singly-linked contig WC18.0:
FIGURE 6B
FIGURE 6C
METHODS FOR TREATING BIPOLAR MOOD DISORDER
ASSOCIATED WITH MARKERS ON CHROMOSOME 18p

INTRODUCTION
Background 

Bipolar Mood Disorder (BP) 

Manic-depressive illness, or bipolar mood disorder (BP), is characterized by episodes
of elevated mood (mania) and depression and is among the most prevalent and
potentially devastating of psychiatric syndromes. The most severe and clinically
distinctive forms of BP are BP-I (severe bipolar mood disorder) and SAD-M
(schizoaffective disorder manic type), which are characterized by at least one full
episode of mania, with or without episodes of major depression (defined by lowered
mood, or depression, with associated disturbances in rhythmic behaviors such as
sleeping, eating, and sexual activity). A milder form of BP is BP-II, bipolar mood
disorder with hypomania and major depression. BP-I often co-segregates in families
with more etiologically heterogeneous syndromes, such as unipolar major depressive
disorder (MDD), which is a more broadly defined phenotype. See McInnes, L.A.
and Freimer, N.B., Mapping genes for psychiatric disorders and behavioral traits,
Curr. Opin. in Genet. and Develop., 5:376-381 (1995).

Treatment of Individuals With Bipolar Mood Disorder

An estimated 2-3 million people in the United States are affected by BP-I. Currently,
individuals are typically evaluated for bipolar mood disorder using the clinical criteria
set forth in the most current version of the American Psychiatric Association's
Diagnostic and Statistical Manual of Mental Disorders (DSM). Many drugs have
been used to treat individuals diagnosed with bipolar mood disorder, including lithium
salts, carbamazepine and valproic acid. However, none of the currently available
drugs is able to treat every individual diagnosed with severe bipolar mood disorder
(termed BP-I), and drug treatments are effective in only approximately 60-70% of
individuals diagnosed with BP-I. Moreover, it is currently impossible to predict which
drug treatments will be effective in particular BP-I affected individuals. Commonly,
upon diagnosis, affected individuals are prescribed one drug after another until one is
found to be effective. Early prescription of an effective drug treatment is critical for
several reasons, including the avoidance of extremely dangerous manic episodes and the
risk of progressive deterioration if effective treatments are not found. Also, appropriate
treatment may prevent depressive episodes in BP-I individuals; these episodes are also
dangerous and are characterized by a high suicide rate. The high prevalence of the
disorder, together with the frequent occurrence of hospitalizations, psychosocial
impairment, suicide and substance abuse, has made BP-I a major public health
concern.

Genetic Basis for Bipolar Mood Disorder 

Mapping genes for common diseases believed to be caused by multiple genes, such
as BP-I, may be complicated by the typically imprecise definition of phenotypes, by
etiologic heterogeneity and by uncertainty about the mode of genetic transmission of
the disease trait. With psychiatric disorders there is even greater ambiguity in
distinguishing individuals who likely carry an affected genotype from those
who are genetically unaffected. For example, one can define an affected phenotype
for BP by including one or more of the broad grouping of diagnostic classifications
that constitute the mood disorders: BP-I, SAD-M, MDD, and BP-II.

Thus, one of the greatest difficulties facing psychiatric geneticists is uncertainty
regarding the validity of phenotype designations, since clinical diagnoses are based
solely on clinical observation and subjective reports. Also, with complex traits such
as psychiatric disorders, it is difficult to map the trait-causing genes genetically
because: (1) the BP-I phenotype does not exhibit classic Mendelian recessive or
dominant inheritance patterns attributable to a single genetic locus; (2) there may be
incomplete penetrance, i.e., individuals who inherit a predisposing allele may not
manifest the disease; (3) the phenocopy phenomenon may occur, i.e., individuals who
do not inherit a predisposing allele may nevertheless develop the disease due to
environmental or random causes; and (4) genetic heterogeneity may exist, in which case
mutations in any one of several genes may result in identical phenotypes.

The existence of one or more major genes associated with BP-I and with a
clinically similar diagnostic category, SAD-M (schizoaffective disorder manic
subtype), is supported by segregation analyses and twin studies (Bertelson et al.,
1977; Freimer and Reus, 1992; Pauls et al., 1992). However, efforts to identify the
chromosomal location of BP-I genes have yielded disappointing results, in that reports
of linkage between BP-I and markers on chromosomes X and 11 could be neither
independently replicated nor confirmed in the re-analyses of the original pedigrees
(Baron et al., 1987; Egeland et al., 1987; Kelsoe et al., 1989; Baron et al., 1993).
The possible localization of BP genes on chromosomes 18 (pericentromeric region)
and 21q has been suggested, but in both cases the proposed candidate region is not
well defined and there is equivocal support for either location (Berrettini et al. (1994)
Proc. Natl. Acad. Sci. USA 91, 5918-5921; Murray, J.C., et al. (1994) Science 265,
2049-2054; Pauls et al., Am. J. Hum. Genet. 57:636-643 (1995); Maier et al., Psych.
Res. 59:7-15 (1995); Straub et al., Nature Genet. 8:291-296 (1994)). Recent
investigations have led to the isolation of chromosome 18-specific brain transcripts
which have been suggested to be positional candidates for bipolar disorder
(Yoshikawa et al., Am. J. Med. Gen. 74, 140-149 (1997)).

Despite abundant evidence that BP has a major genetic component, linkage
studies have not yet succeeded in definitively localizing a BP gene. This is mainly
because mapping studies of psychiatric disorders have generally been conducted under
a paradigm appropriate for mapping genes for simple Mendelian disorders, namely,
using linkage analysis in the expectation of finding high lod scores that definitively
signpost the location of disease genes. The follow-up to early BP linkage studies,
however, showed that even extremely high lod scores at a single location can be false
positives. See Egeland et al., Nature 325:783-787 (1987); Baron et al., Nature
326:289-292 (1987); Kelsoe et al., Nature 342:238-243 (1989); and Baron et al.,
Nature Genet. 3:49-55 (1993). These earlier studies used largely uninformative
markers and did not use stringent criteria for identifying affected individuals.

Linkage Disequilibrium Analysis 

Linkage disequilibrium (LD) analysis is a powerful tool for mapping disease
genes and may be particularly useful for investigating complex traits. LD mapping
is based on the following expectations: for any two members of a population, it is
expected that recombination events occurring over several generations will have
shuffled their genomes, so that they share little in common with their ancestors.
However, if these individuals are affected with a disease inherited from a common
ancestor, the gene responsible for the disease and the markers that immediately
surround it will likely be inherited without change, or IBD ("identical by descent"),
from that ancestor. The size of the regions that remain shared (i.e., IBD) is
inversely proportional to the number of generations separating the affected individuals
and their common ancestor. Thus, "old" populations are suitable for fine scale
mapping, and recently founded ones are appropriate for using LD to localize
disease genes more approximately (Houwen et al., 1994, in particular Fig. 3 and
accompanying text). Because isolated populations typically have had a small number
of founders, they are particularly suitable for LD approaches, as indicated by several
successful LD studies conducted in Finland (de la Chapelle, 1993).
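
The inverse relationship between shared segment size and number of generations can be made concrete with a minimal sketch. It assumes that crossovers fall as a Poisson process along the chromosome (Haldane's model, no interference) and ignores chromosome ends, so that the distance from the disease locus to the nearest crossover on one side, accumulated over g meioses, is exponentially distributed with mean 100/g cM. The function names and the example meiosis counts are illustrative and are not taken from this application.

```python
import math

def expected_shared_cm(meioses):
    """Expected IBD length in cM, counting both sides of the disease locus.

    meioses is the total number of meioses separating the affected
    individuals through their common ancestor (an assumed input).
    """
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    return 2 * 100.0 / meioses

def prob_marker_still_shared(distance_cm, meioses):
    """Probability that no crossover fell between locus and marker in any meiosis."""
    return math.exp(-meioses * distance_cm / 100.0)

# Older common ancestors (more meioses) leave shorter shared segments.
for g in (10, 20, 100):
    print(g, expected_shared_cm(g), round(prob_marker_still_shared(1.0, g), 3))
```

Under these assumptions a shared segment is expected to be long, and so easy to detect with sparse markers, when few meioses separate the affected individuals, and short, and so better for fine mapping, when many do.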

LD analysis has been used in several positional cloning efforts (Kerem et al.,
1989; MacDonald et al., 1992; Petrukhin et al., 1993; Hastbacka et al., 1992 and
1994), but in each case the initial localization had been achieved using conventional
linkage methods. Positional cloning is the isolation of a gene solely on the basis of
its chromosomal location, without regard to its biochemical function. Lander and
Botstein (1986) proposed that LD mapping could be used to screen the human genome
for disease loci, without conventional linkage analyses. This approach was not
practical until a set of mapped markers covering the genome became available
(Weissenbach et al., 1992). The feasibility of genome screening using LD mapping
is now demonstrated by the applicants.

Identification of the chromosomal location of a gene responsible for causing 
severe bipolar mood disorder can facilitate diagnosis, treatment and genetic 
counseling of individuals in affected families. 

Due to the severity of the disorder and the limitations of a purely phenotypic 
diagnosis of BP-I, there is a tremendous need to subtype individuals with BP-I 
genetically to confirm clinical diagnoses and to determine appropriate therapies based 
on their genotypic subtype. 



SUMMARY OF THE INVENTION

The present invention comprises using genetic linkage and haplotype analysis
to identify an individual having a bipolar mood disorder gene on the short arm of
chromosome 18. In addition, the present invention provides markers linked to a gene
responsible for susceptibility to bipolar mood disorder that will enable researchers to
focus future analysis on that small chromosomal region and will accelerate the
sequencing of a bipolar mood disorder gene located at 18p.

The present invention provides, for the first time, a localization of a BP-I
susceptibility locus to a 300 to 500 kb region of the short arm of chromosome 18.

The present invention is directed to methods of detecting the presence of a
bipolar mood disorder susceptibility locus in an individual, comprising analyzing a
sample of DNA for the presence of a DNA polymorphism on the short arm of
chromosome 18 between SAVA5 and ga203, wherein the DNA polymorphism is
associated with a form of bipolar mood disorder. The invention includes the use of
genetic markers in the roughly 500 kb region between the SAVA5 locus and the
ga203 locus, inclusive, to diagnose bipolar mood disorder genetically in individuals
and to confirm phenotypic diagnoses of bipolar mood disorder. Preferably, the
sample of DNA is analyzed for the presence of a DNA polymorphism on the short
arm of chromosome 18 in the roughly 300 kb region between D18S1140 and W3422.

In a further embodiment, the invention provides methods of classifying
subtypes of bipolar mood disorder by identifying one or more DNA polymorphisms
located within the 500 kb region between the SAVA5 and ga203 loci, inclusive, on the
short arm of chromosome 18 and analyzing DNA samples from individuals
phenotypically diagnosed with bipolar mood disorder for the presence or absence of
one or more of said DNA polymorphisms. Preferably, the sample of DNA is
analyzed for the presence or absence of one or more of said DNA polymorphisms in
the roughly 300 kb region between D18S1140 and W3422 on the short arm of
chromosome 18.

In yet a further embodiment, the methods of the invention include a method
of treating an individual diagnosed with bipolar mood disorder comprising identifying
one or more DNA polymorphisms located within the 500 kb region of chromosome
18 between SAVA5 and ga203, analyzing DNA samples from individuals
phenotypically diagnosed with bipolar mood disorder for the presence or absence of
one or more of the DNA polymorphisms, and selecting a treatment plan that is most
effective for individuals having a particular genotype within the 500 kb region of
chromosome 18 between SAVA5 and ga203. Preferably, the sample of DNA is
analyzed for the presence or absence of one or more DNA polymorphisms in the
roughly 300 kb region between D18S1140 and W3422 on the short arm of
chromosome 18.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a pedigree chart showing two families, CR001 and CR004.
Affected individuals are denoted by black symbols, deceased individuals by a diagonal
slash. A schematic of each individual's haplotype (where available) is shown below
the ID number. Recombinations are denoted by "-x", consanguineous marriages by
a double bar, and the conserved haplotype by black shading within the haplotype bars.
The larger conserved region for CR004 is stippled; the larger conserved region for
CR001 is indicated by a dashed outline. An "I" underneath the haplotype bars
indicates an inferred haplotype. A "?" indicates that phase is uncertain. The connection
between CR001 and CR004, dating to an 18th century founding couple, is indicated
by the dashed lines joining individuals III-6 and I-4.

FIG. 2 is a table of lod scores for markers covering the entire human genome
that exceeded the arbitrary coverage thresholds. Lod scores are shown for two
markers on chromosome 18: D18S59 and D18S1105.

FIG. 3 depicts the extent of marker coverage used in the pedigree genome
screening study for each chromosome. Coverage is defined as regions for which a lod
score of at least 1.6 would have been detected (in the combined data set) for markers
truly linked to BP-I under the model employed. Areas that remain uncovered (at this
threshold) are unshaded. Markers for which lod scores were obtained that exceeded
the empirically determined coverage thresholds in CR001, CR004, or the combined
data set are shown at their approximate chromosomal location. The symbols to the
right of the chromosome indicate the thresholds exceeded at that marker: a circle
signifies that the lod score at a marker exceeded the threshold of 0.8 in CR001, a
diamond signifies that the lod score exceeded the threshold of 1.2 in CR004, and a
star signifies that the lod score exceeded the threshold of 1.6 in the combined data
set.
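
The symbol convention described for FIG. 3 amounts to a simple threshold lookup, sketched below. The thresholds and symbols come from the description above; the function name, the dictionary layout and the example lod scores are hypothetical and serve only as illustration.

```python
# Thresholds and symbols as stated in the description of FIG. 3.
THRESHOLDS = {"CR001": 0.8, "CR004": 1.2, "combined": 1.6}
SYMBOLS = {"CR001": "circle", "CR004": "diamond", "combined": "star"}

def symbols_for_marker(lod_scores):
    """Return the symbol for every data set whose lod score exceeds its threshold.

    lod_scores maps a data set name ("CR001", "CR004", "combined") to the
    lod score observed at one marker; missing data sets earn no symbol.
    """
    return [SYMBOLS[name] for name in ("CR001", "CR004", "combined")
            if lod_scores.get(name, float("-inf")) > THRESHOLDS[name]]

# Hypothetical lod scores at a single marker, for illustration only.
example = {"CR001": 1.1, "CR004": 0.9, "combined": 1.7}
print(symbols_for_marker(example))
```

The comparison is strict because the description speaks of lod scores that "exceeded" each threshold.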

FIGS. 4A and 4B depict the lod score for the maximum likelihood estimate (MLE)
of theta in the combined sample for the 473 microsatellite markers typed in the
pedigree genome screen. The MLEs of theta were assigned to the following
categories: theta < 0.10; 0.10 < theta < 0.40; theta > 0.40. Note that the scale
for the x-axis (distance from pter) changes with chromosomes.
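
For readers unfamiliar with the quantities plotted in FIGS. 4A and 4B, the following minimal sketch shows a textbook two-point lod score and the binning of theta estimates. It assumes phase-known, fully informative meioses, which is a simplification and not the pedigree model used in the study. The function names are hypothetical, and the assignment of the boundary values 0.10 and 0.40 to the middle category is an assumption, since the text leaves them unassigned.

```python
import math

def lod(theta, recombinants, meioses):
    """log10 of L(theta) / L(0.5) for k recombinants among n informative meioses."""
    k, n = recombinants, meioses
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if theta == 0.0:
        # Any observed recombinant rules out theta = 0.
        return float("-inf") if k > 0 else n * math.log10(2.0)
    return (k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
            + n * math.log10(2.0))

def mle_theta(recombinants, meioses):
    """Maximum likelihood estimate of theta, capped at 0.5 (no linkage)."""
    return min(recombinants / meioses, 0.5)

def theta_category(theta):
    """Bin a theta estimate; boundary handling is an assumption (see lead-in)."""
    if theta < 0.10:
        return "theta < 0.10"
    if theta <= 0.40:
        return "0.10 <= theta <= 0.40"
    return "theta > 0.40"

# Hypothetical counts: 1 recombinant in 10 informative meioses.
theta_hat = mle_theta(1, 10)
print(theta_hat, lod(theta_hat, 1, 10), theta_category(theta_hat))
```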

FIG. 5 is a portion of an integrated map of the 5 cM 18pter region of
chromosome 18.

FIGS. 6A, 6B and 6C are a list of markers on chromosome 18, with map
positions noted.

FIG. 7 describes 18p allele frequencies for disease chromosomes (aff 105)
versus nontransmitted chromosomes (ntrans) and samples from a control population
of Costa Rican students and their parents (control). The name of each marker used
in this study is indicated on the left. The second column of numbers refers to allele
length in base pairs.

FIG. 8 depicts haplotype analysis of individuals affected with BP-I. The
column labelled 18p refers to the patient identifier, and each patient identifier is
repeated in 2 rows to indicate allele results for each of the patient's two copies
of chromosome 18. The columns labelled "PANR" and "MANR" refer to the
paternal and maternal identifiers, respectively, associated with the particular patient,
other than 0, 1 and 2, which indicate that parental samples were not available. The
column headings to the right of the "PANR" and "MANR" columns represent names of
specific markers in the 18p region that were used in the haplotype analysis. The
markers are listed in the order they appear on chromosome 18. The allele length (in
base pairs) is indicated under the column heading for each marker for a particular patient.
In the column to the immediate right of each marker column, a "1" indicates that the
phase is known, i.e., that it is known whether a particular allele is inherited from the
paternal or maternal chromosome, and a "0" indicates that the phase is not definitely
known. The shaded horizontal bars depict haplotypes of at least three markers which
include a 154 allele length at D18S59, other than for patients 218, 225, 232, 234, 311,
314 and 458, where the shaded region depicts small sections that do not have the 154
allele at D18S59. The lightly shaded regions depict uncertainty as to whether the
individual has the affected haplotype, as the phase is not known with certainty. In
addition, the presence of an allele length of 232 (or 234) with marker ta201 is thought
to result from a highly mutable allele and may not be distinct from the 230 allele.
Similarly, the 202 allele at ca212 may not be distinct from the 200 allele at ca212.
Patients 246, 247, 248, 311, 316, 367, 384, 501, 531, 587, 536, 684, 667 and 669
exhibit a 242, 244, 250, 252 or 214 allele at marker ta201, which indicates a potential
marker location. Patients 488, 435 and 236 exhibit haplotypes that are distinct from
the pedigrees that were analyzed.

FIG. 9 depicts haplotype analysis of nontransmitted chromosomes from
parents of individuals affected with BP-I. The labels "ERSN" and "KID" refer to the
parental and patient identifiers, respectively. As above, allele length is provided in
base pairs below each marker, with an indication as to whether phase was known (1)
or not known (0) given to the right of these values. The markers, shading and allele
characteristics described for Figure 8 also apply to this figure.


FIG. 10 depicts haplotype analysis of control samples obtained from an
unscreened population of students of the University of Costa Rica and their parents
representing the general population. Identifiers are provided in the column headed
"com", with allele length and phase determination given in the remainder of the table.
The markers, shading and allele characteristics described for Figure 8 also apply to
this figure. Data are not complete for all markers, as indicated by blank boxes
or the terms "miss" or "missing".

FIG. 11 depicts Ancestral Haplotype Reconstruction results in disease
chromosomes.

DESCRIPTION OF SPECIFIC EMBODIMENTS

The recent availability of highly polymorphic, genetically mapped markers
covering the human genome (Weissenbach, J., et al. (1992) Nature 359, 794-801;
Murray, J.C., et al. (1994) Science 265, 2049-2054; Gyapay, G., et al. (1994)
Nature Genet. 7, 246-339) has allowed the development of a multi-stage paradigm for
mapping genes for complex traits. In the first stages, complete genome screening
(e.g., through lod score analysis) is used to identify possible localizations for disease
genes. Subsequently, the regions highlighted by the screening study are more
intensively investigated to confirm the initial localizations and delineate clear
candidate regions. Finally, fine mapping methods (such as haplotype or linkage
disequilibrium (LD) analysis) or candidate gene approaches are used for positional
cloning of disease genes.

Our genome screening study for BP employed the following strategies. Unlike
previous genetic studies of BP, only those individuals with the most severe and
clinically distinctive forms of BP (BP-I and schizoaffective disorder manic type,
SAD-M) were considered as affected, rather than including those diagnosed with a milder
form of BP (BP-II) or with unipolar major depressive disorder (MDD). Two large
pedigrees (CR001 and CR004) were selected from a genetically homogeneous
population, that of the Central Valley of Costa Rica (as described in Escamilla, M.A.,
et al. (1996) Neuropsychiat. Genet. 67, 244-253, and in Freimer, N.B., et al. (1996)
Neuropsychiat. Genet. 67, 254-263, both incorporated by reference herein). The
entire human genome was screened for linkage using mapped microsatellite markers
and a model for genetic analysis in which most of the linkage information was
derived from affected individuals. The goal of this stringent linkage analysis was to
identify all regions potentially harboring major genes for BP-I in the study population.
Empirically determined lod score thresholds (using linkage simulation analyses) were
derived, to suggest regions worthy of further investigation.

Identification of all suggestive regions and weighing the relative importance
of findings required complete screening of the genome. The coverage approach was
developed to gauge the progress of this effort. Conventionally, the thoroughness of
genome screening is evaluated by excluding genome regions from linkage under given
genetic models. This approach, which is highly sensitive to misspecification of genetic
models, may be poorly suited for genome screening studies of complex traits; it is
tied to the expectation of finding linkage at a single locus and demonstrating absence
of linkage at all other locations in the genome. Additionally, exclusion analyses do
not differentiate genome regions where linkage is not excluded because
markers are uninformative in the study population from those in which the genotype
data are simply ambiguous. In contrast, the coverage approach is designed for studies
aimed at genome screening rather than for studies where the goal is to demonstrate
a single unequivocal linkage finding, and it provides explicit data regarding the
informativeness of markers in the study pedigrees. Its use lessens the possibility that
one would prematurely dismiss a given genome region as being unpromising for
further study.

Because the exact genetic length of chromosomes is not clearly established,
it is impossible to be certain that one has screened the entire genome. Although we
report coverage of about 94% of the genome (under the 90% dominant model) at the
thresholds described above, this probably represents an underestimate. The remaining
coverage gaps in our study occur predominantly at or near telomeres; as the upper
bound estimates for the length of each chromosome were used, it is likely that the
actual coverage gaps in these regions are smaller than our conservative assessment.

The presence of consistently positive lod scores over a given region was
considered to be of greater significance than isolated peak lod scores. Such clustering
suggests true co-segregation of markers and phenotypes (i.e., alleles are shared
identically by descent rather than identically by state) and is more readily observed
in analyses of a few large pedigrees (as in our study) than in examination of several
smaller families. The data presented herein indicate clustering of positive lod scores
in the region of the telomere of 18p.

The genome screen was conducted in two stages. The Stage I screen
identified areas suggestive of linkage, so that those areas could be saturated with
available markers, and so that regions, referred to as 'coverage gaps', could be
pinpointed where markers were insufficiently informative in our sample to detect
evidence of linkage. The Stage II screen followed up on regions flanking each
marker that yielded peak lod scores approximately equal to or greater than the
thresholds used for the coverage calculations, which were deemed regions of interest,
and filled in coverage gaps. The results of the complete genome screen (Stages I and
II) using 473 markers are described below.

In addition, linkage disequilibrium analysis of an independently collected
sample of 48 unrelated BP-I patients was initially conducted. These patients were
from the same ancestral population as the patients in the CR001 and CR004
pedigrees. The LD analysis was conducted with markers on the short arm of
chromosome 18 (18p), in a 5 centimorgan (cM) region ("5 cM 18pter region")
extending from the end of the 18p telomere to a distance of 5 cM along the short arm
of chromosome 18 (18p). The LD analysis gave evidence of LD in this region,
particularly at marker D18S59 and also at D18S476. LD analysis of further BP-I
patients from the CRCV with markers in this 5 cM 18pter region was conducted to
confirm and fine map a BP-I gene in this region. This approach, using additional
BP-I patients from this CRCV population and additional markers, identifies the region of
maximum LD and can precisely localize a BP-I susceptibility gene.

Fine mapping of the 5 cM 18pter region resulted in the identification of two DNA
markers (D18S1140 and W3422) defining the boundaries of the BP-I candidate region,
approximately 300 kb in size, thus allowing a systematic search for the BP-I gene(s).

A conservative approach to linkage analysis was used in that almost all of the
information for linkage is derived from individuals with a severe, narrowly defined
phenotype. While this approach made it very unlikely that lod scores greater than
conventional thresholds of statistical significance (e.g., >3) would be obtained, it
provided confidence in the robustness of the most suggestive findings.

Direct cDNA selection can be used to isolate segments of expressed DNA
from the 300 kb region between D18S1140 and W3422 (M. Lovett, J. Kere, L.M.
Hinton, Proc. Natl. Acad. Sci. USA 88 9628-9632 (1991); Y.-S. Jou et al., Genomics
24 410-413 (1994)). By using bacterial artificial chromosomes (BAC) (e.g.,
commercially available from Research Genetics Inc., Huntsville, Alabama), a group
of cDNAs can be identified, and hybridization and PCR amplification experiments can
be used to determine if these cDNA segments are derived from the 300 kb region.

The cDNAs can then be used to determine whether specific sequences are
expressed at lower levels (or not at all) in affected individuals compared to
non-carrier individuals. Measurement of mRNA levels in lymphoblastoid cell lines can
be used as an initial screen. The cell lines are prepared by drawing blood from
individuals, transforming the lymphoblasts with EBV and growing the immortalized
cells in culture. Total RNA and DNA are extracted from the cultured human
lymphoblastoid cell lines. Northern blot hybridization is used to detect reduced
levels of a specific sequence in affected individuals compared to levels from an
unaffected, non-carrier individual. Such a reduction may result from mutations in
the BP-I gene on the chromosomes of these affected individuals that decrease levels
of mature mRNA and play a primary role in BP-I. Thus, alterations in gene sequences
in affected individuals can be determined.

The polymerase chain reaction (PCR) is used to amplify the gene and to
determine its sequence from affected individuals. Sequence comparison with
unaffected, non-carrier individuals is carried out to identify polymorphisms in the
gene sequence that are responsible for BP-I.

The identification of the biochemical defect that causes BP-I provides a basis
for treatments for this disease. In addition, knowledge that certain mutations in the
gene are responsible for the disease allows mutation detection tests to be used as a
definitive diagnosis for BP-I.

Thus, the present invention allows the isolation of a nucleic acid molecule that
can be used in the identification of the presence (or absence) of a mutation in the BP-I
gene of a human and thus can be used in the diagnosis of BP-I or in the genetic
counseling of individuals, for example those with a family history of BP-I (although
the general population can be screened as well). In particular, it should be noted that
any mutation in the BP-I gene away from the normal gene sequence is an indication
of a potential genetic flaw; even so-called "silent" mutations that do not encode a
different amino acid at the location of the mutation are potential disease mutations,
since such mutations can introduce into (or remove from) the gene an untranslated
genetic signal that interferes with the transcription or translation of the gene. Thus,
advice can be given to a patient concerning the potential for transmission of BP-I if
any mutation is present. While an offspring with the mutation in question may or
may not have symptoms of BP-I, patient care and monitoring can be selected that will
be appropriate for the potential presence of the disease; such additional care and/or
monitoring can be eliminated (along with the concurrent costs) if there are no
differences from the normal gene sequence. As additional information (if any)
becomes available (e.g., that a given silent mutation or conservative replacement
mutation does or does not result in BP-I), the advice given for a particular mutation
may change. However, the change in advice given does not alter the initial
determination of the presence or absence of mutations in the gene causing BP-I.

Generally, mutations are identified in the human gene for use in a method of
detecting the presence of a genetic defect that causes or may cause BP-I, or that can
or may transmit BP-I to an offspring of the human. Initially, the practitioner will be
looking simply for differences from the sequence identified as being normal and not
associated with disease, since any deviation from this sequence has the potential of
causing disease, which is a sufficient basis for initial diagnosis, particularly if the
different (but still unconfirmed) gene is found in a person with a family history of
BP-I. As specific mutations are identified as being positively correlated with BP-I (or
its absence), practitioners will in some cases focus on identifying one or more specific
mutations of the gene that change the sequence of a protein product of the gene or
that result in the gene not being transcribed or translated. However, simple
identification of the presence or absence of any mutation in the gene of a patient will
continue to be a viable part of genetic analysis for diagnosis, therapy and counseling.

The actual technique used to identify the gene or gene mutant is not itself part
of the practice of the invention. Any of the many techniques to identify gene
mutations, whether now known or later developed, can be used, such as direct
sequencing of the gene from affected individuals, hybridization with specific probes,
which includes the technique known as allele-specific oligonucleotide hybridization,
either without amplification or after amplification of the region being detected, such
as by PCR. Other analysis techniques include single-strand conformation
polymorphism (SSCP), restriction fragment length polymorphism (RFLP), enzymatic
mismatch cleavage techniques and transcription/translation analysis. All of these
techniques are described in a number of patents and other publications; see, for
example, "Laboratory Protocols for Mutation Detection" (1996) Oxford University
Press, Editor: Ulf Landegren.

Depending on the patient being tested, different identification techniques can
be selected to achieve particularly advantageous results. For example, for a group
of patients known to be associated with particular mutations of the gene,
oligonucleotide ligation assays, "mini-sequencing" or allele-specific oligonucleotide
(ASO) hybridization can be used. For screening of individuals who are not known
to be associated with a particular mutation, single-strand conformation polymorphism,
total sequencing of genomic and/or cDNA and comparison with standard sequences are
preferred.

In many identification techniques, some amplification of the host genomic
DNA (or of messenger RNA) will take place to provide for greater sensitivity of
analysis. In such cases it is not necessary to amplify the entire gene, merely the part
of the gene or the specific location within the gene that is being detected. Thus, the
method of the invention generally comprises amplification (such as via PCR) of at
least a segment of the gene, with the segment being selected for the particular
analysis being conducted by the diagnostician.

The patient on whom diagnosis is being carried out can be an adult, as is
usually the case for genetic counseling, or a newborn, or prenatal diagnosis can be
carried out on a fetus. Blood samples are usually used for genetic analysis of adults
or newborns (e.g., screening of dried blood on filter paper), while samples for
prenatal diagnosis are usually obtained by amniocentesis or chorionic villus biopsy.

Prior to the present invention, affected individuals were prescribed one drug
after another until one was found to be effective. As BP-I was diagnosed using
clinical criteria, no correlation between using a particular drug and its efficacy in a
given case was observed. As a result of the present invention, BP-I subtypes can be
diagnosed at the molecular level and effective treatment predicted.

For example, lithium salts, carbamazepine and valproic acid have all been
prescribed for BP-I affected individuals with serendipitous results. An individual can
now be diagnosed with bipolar mood disorder by analyzing genetic material from that
individual for the presence or absence of one or more nucleic acid mutations as
described above. As a result of this diagnosis at the molecular level, an effective
treatment can be determined by collecting data to obtain a statistically significant
correlation of a particular treatment with the different subtypes of BP-I. Thus, the
practitioner is able to select a specific drug for the treatment of a particular subtype
of BP-I and does not merely rely on trial and error.

Alternatively, the full-length normal genes for BP-I from humans, as well as
shorter genes that produce functional proteins, can be used to correct BP-I in a human
patient by supplying to the human an effective amount of a gene product of the human
gene, either by gene therapy or by in vitro production of the protein followed by
administration of the protein. It should be recognized that the various techniques for
administering genetic materials or gene products are well known and are not
themselves part of the invention. The invention merely involves supplying the genetic
materials or proteins identified as a result of the present invention in place of the
genetic materials or proteins previously administered. For example, techniques for
transforming cells to produce gene products are described in U.S. Patent No.
5,283,185 entitled "Method for Delivering Nucleic Acid into Cells," as well as in
numerous scientific articles, such as Felgner et al., "Lipofection: A Highly Efficient,
Lipid-Mediated DNA-Transfection Procedure," Proc. Natl. Acad. Sci. U.S.A., 84,
7413-7417 (1987); techniques for in vivo protein production are described in, for
example, Mueller et al., "Laboratory Methods - Efficient Transfection and Expression
of Heterologous Genes in PC12 Cells," DNA and Cell Biol., 9(3), 221-229 (1990).

Administration of proteins and other molecules to overcome a deficiency
disease is so well known (e.g., administration of insulin to correct for high blood sugar
in diabetes) that further discussion of this technique is not necessary. Some
modification of existing techniques may be required for particular applications, but
those modifications are within the skill level of the ordinary practitioner using existing
knowledge and the guidance provided in this specification.

The invention now being generally described, the following examples are 
provided for purposes of illustration only and are not to be considered to limit the 
invention. 

EXAMPLES 

Pedigrees 

Two independently ascertained Costa Rican pedigrees (CR001 and CR004)
were chosen because they contained a high density of individuals with BP-I and
because their ancestry could be traced to the founding population of the Central
Valley of Costa Rica. The current population of the Central Valley (consisting of
about two million people) is predominantly descended from a small number of
Spanish and Amerindian founders in the 16th and 17th centuries (Escamilla, M.A.,
et al. (1996) Neuropsychiat. Genet. 67, 244-253). Studies of several inherited
diseases have confirmed the genetic isolation of this population (Leon, P., et al.
(1992) Proc. Natl. Acad. Sci. USA 89, 5181-5184; Uhrhammer, N., et al. (1992)
Am. J. Hum. Genet. 57, 103-111). An extensive description of pedigrees CR001 and
CR004 has been published (Freimer, N.B., et al. (1996) Neuropsychiat. Genet. 67,
254-263). In the course of the study, two links between these pedigrees were
discovered. However, the families were analyzed separately because these links were
discovered after the simulation analyses were completed and after the genome
screening study had been initiated.

All available adult members of these families were interviewed in Spanish
using the Schedule for Affective Disorders and Schizophrenia Lifetime version
(SADS-L) (Endicott, J. et al. (1978) Arch. Gen. Psych. 35, 837-844). Individuals
who received a psychiatric diagnosis were interviewed again in Spanish by a research
psychiatrist using the Diagnostic Interview for Genetic Studies (DIGS) (Nurnberger,
J.L. et al. (1994) Arch. Gen. Psychiat. 51, 849-859). This recently developed
instrument is similar to, but more detailed than, SADS-L. The interviews and medical
records were then reviewed by two blinded best estimators who reached a consensus
diagnosis. The diagnostic procedures are described in detail in Freimer, N.B., et al.
(1996) Neuropsychiat. Genet. 67, 254-263 (incorporated by reference herein).

Unrelated CRCV BP-I Patient Study 

BP localizations obtained through the CRCV pedigree studies were confirmed
by genotyping an independently collected sample of 48 unrelated BP-I patients from
the CRCV. In this fine mapping LD analysis, the 48 patients were genotyped using
microsatellite markers spaced at narrow intervals across chromosome 18. As these
patients are descended from the same ancestral population as the patients in the
pedigrees previously studied (CR001 and CR004), many of them should share disease
susceptibility alleles inherited identically by descent (IBD) from one or a few common
ancestors, and linkage disequilibrium (LD) should be present at marker loci
surrounding the disease genes.

The sample of 48 BP-I patients included 25 women and 23 men who were
recruited from psychiatric hospitals and clinics in the CRCV. These patients were
ascertained only on the basis of diagnosis and CV ancestry, and were not selected on
the basis of history of BP illness in family members. A structured interview of each
patient was conducted by a psychiatrist, and medical and hospital records were
collected. Ascertainment and diagnostic procedures were as described above.
However, in order to lessen further the probability of phenocopies among this
unrelated sample, for which we lacked pedigree information, the affected phenotype
was defined even more narrowly than in the pedigree study. Individuals considered
affected in this study had to have suffered at least two disabling episodes of mania
(requiring hospitalization) and a first onset of the illness before age 45.

Genealogical research on each of the 48 BP-I patients confirmed that on
average, 70% of their great-grandparents were born in the CRCV. Individuals whose
great-grandparents were born in the CRCV were considered likely to be descended
from the original Spanish and Amerindian founders of the CRCV. Genealogical
research showed that 2 patients are first cousins and the remaining 46 have no
relationship within the past 4 generations.

Genotyping Pedigree Studies

Linkage simulations were used to select the most informative individuals from
pedigrees CR001 and CR004 for genotyping studies (Freimer, N.B., et al. (1996)
Neuropsychiat. Genet. 67, 254-263). Under a 90% dominant model, simulation
analyses with these individuals suggested that evidence of linkage would likely be
detected (e.g., a probability of 92% of obtaining lod > 1.0 in the combined data set)
using markers with an average heterozygosity of 0.75 spaced at 10 cM intervals (as
discussed in Freimer, N.B., et al. (1996) Neuropsychiat. Genet. 67, 254-263). For
the Stage I screen, the most polymorphic markers (307 in total) were chosen, placed
at approximately 10 cM intervals on the 1992 Genethon map (Houwen, R., et al.
(1992) Nature 359, 794-801). These markers were then supplemented by a small
number of markers from the Cooperative Human Linkage Center (CHLC) public
database. For the Stage II screen, 166 markers were added from newer Genethon and
CHLC maps as they became available (Murray, J.C. et al. (1994) Science 265,
2049-2054; Gyapay, G., et al. (1994) Nature Genet. 7, 246-339) and from the public
database of the Utah Center for Genome Research, all of which are publicly
available. DNA samples (from individuals in the CEPH families) that were used for
size standards for Genethon and CHLC markers were included in the experiments to
permit comparison of allele sizes between members of the CRCV population and
individuals in the CEPH database. Genotyping procedures were as described
previously (DiRienzo, A. et al. (1994) Proc. Natl. Acad. Sci. USA 91, 3166-3170
(incorporated by reference herein)). Briefly, one of the two PCR primers was labeled
radioactively using a polynucleotide kinase and PCR products were run on
polyacrylamide gels. Autoradiographs were scored independently by two raters.
Data for each marker were entered into the computer database twice and the resultant
files were compared for discrepancies.

Genotyping of Unrelated BP-I CRCV Patients

Twenty-seven markers were used to genotype all 48 individuals (as well as 53
individuals used to establish genetic phase) at approximately 5 cM intervals along the
entire chromosome 18. It was hypothesized that such a screen would permit the
evaluation of evidence in the 18pter region and the investigation of other regions on
chromosome 18 in which linkage to BP has been suggested by other groups in other
populations. For each individual, two-marker haplotypes in each of the 26
inter-marker intervals were investigated. For 38 of the 48 BP-I patients, genotypes of
parents or children were available to assist in phase determination. Because of phase
ambiguities in the remaining 10 individuals, minimal and maximal two-marker
haplotype sharing was evaluated as follows: (1) Minimal: the number of individuals
(and chromosomes) who definitely shared a chromosomal segment defined by a
particular pair of alleles (phase known chromosomes) and (2) Maximal: the number
of individuals (and chromosomes) who could possibly share a chromosomal segment
defined by a particular pair of alleles (includes phase unknown chromosomes). The
threshold used to identify areas of high IBD sharing of chromosomes in this initial
screen was designated as maximal sharing of a two-marker haplotype by 50% or more
of the 48 individuals (or 25% or more of the 96 chromosomes).
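
The minimal and maximal sharing counts described above can be sketched in a few lines of Python. This is an illustration only: the data layout (one record per patient, with a phase_known flag and the set of two-marker haplotypes compatible with that patient's genotypes), the function names and the toy data are assumptions made here and are not taken from the study. For simplicity the sketch applies only the per-individual form of the threshold (maximal sharing by 50% or more of the individuals).

```python
def sharing_counts(patients, haplotype):
    """Count individuals who definitely (minimal) or possibly (maximal)
    carry a given two-marker haplotype."""
    # Minimal: phase is known and the haplotype is present.
    minimal = sum(
        1 for p in patients
        if p["phase_known"] and haplotype in p["haplotypes"]
    )
    # Maximal: the haplotype is compatible with the genotypes,
    # whether or not phase could be resolved.
    maximal = sum(1 for p in patients if haplotype in p["haplotypes"])
    return minimal, maximal


def passes_screen(maximal, n_individuals, fraction=0.5):
    """True if maximal sharing reaches the stated fraction of individuals."""
    return maximal >= fraction * n_individuals


# Hypothetical toy data: alleles are written as sizes in b.p.
patients = [
    {"phase_known": True, "haplotypes": {(154, 271), (150, 269)}},
    {"phase_known": True, "haplotypes": {(152, 273), (150, 269)}},
    # Double heterozygote with unknown phase: all four combinations possible.
    {"phase_known": False,
     "haplotypes": {(154, 271), (154, 269), (150, 271), (150, 269)}},
    {"phase_known": True, "haplotypes": {(152, 269), (150, 273)}},
]

minimal, maximal = sharing_counts(patients, (154, 271))
print(minimal, maximal, passes_screen(maximal, len(patients)))
```

With 48 patients, the per-individual threshold corresponds to a maximal count of at least 24 individuals.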

Arbitrary thresholds were designated to identify possible areas of high IBD
sharing among the 48 patients. Eight of the 26 regions passed this screen. Within
each of these 3 regions, one to three additional markers were typed to permit
detection of LD, if present, over regions of one to two cM.

A total of 42 chromosome 18 markers were used to genotype the study
sample:

D18S1140, D18S59, D18S476, D18S481, D18S391, D18S452, D18S843, D18S464,
D18S1153, D18S378, D18S53, D18S453, D18S40, D18S66, D18S56, D18S57,
D18S467, D18S460, D18S450, D18S474, D18S69, D18S64, D18S1134, D18S1147,
D18S60, D18S68, D18S55, D18S477, D18S61, D18S488, D18S485, D18S541,
D18S870, D18S469, D18S874, D18S380, D18S1121, D18S1009, D18S844,
D18S554, D18S461, D18S70 (from pter to qter). Of these 42 markers, four are
located within the region extending from the telomere of 18p to marker
D18S481 (inclusive), which is approximately 5 cM from the telomere of 18p. This
region is referred to as the 5 cM 18pter region. The four markers tested in the 5 cM
18pter region are: D18S59, D18S1140, D18S476 and D18S481.

For each marker, the likelihood that a particular allele (or alleles) is
over-represented on disease chromosomes, as compared to non-disease chromosomes,
was evaluated. The results of this likelihood test provide a conservative but powerful
measure of LD between two loci.

Pedigree Statistical Analyses 

Two-point linkage analyses were performed for all markers. Marker allele
frequencies were estimated from the combined data set with correction for
dependency due to family relationships (Boehnke, M. (1991) Am. J. Hum. Genet. 48,
22-25). The linkage analyses for Stages I and II included the 65 individuals who
were genotyped as well as an additional 65 individuals who had been diagnostically
evaluated but not genotyped. Only individuals with BP-I were considered affected
with the exception of two persons, one in each family, who carry diagnoses of
schizoaffective disorder manic type (SAD-M). The SAD-M individuals were included
as affected because BP-I and SAD-M are often difficult to distinguish from each other
based on their clinical presentation and course of illness (Goodwin, F.K. et al. (1990)
in Manic Depressive Illness (Oxford University Press, New York), pp. 373-401;
Freimer, N.B. et al. (1993) in The Molecular and Genetic Basis of Neurological
Disease, pp. 951-965; Freimer, N.B. et al. (1996) Neuropsychiat. Genet. 67,
254-263; and Freimer, N.B. et al. (1996) Nature Genetics 12, 436-441, all incorporated
by reference herein). In all, 20 individuals were designated as affected within CR004
(Copeman, J.B., et al. (1995) Nature Genet. 9, 80-85 available for genotyping) and
10 individuals from CR001 (Kelsoe, J.R. et al. (1989) Nature 342, 238-243 available
for genotyping). The phenotype for all other individuals was designated as unknown
except for 17 individuals who were designated as unaffected because they had been
thoroughly clinically evaluated, showed no evidence of any psychiatric disorder, and
were well beyond the age of risk (50) for BP-I (linkage simulation studies indicated
that these unaffected individuals contributed little information to the linkage analysis).

Linkage analyses were performed using a nearly dominant model (assuming
penetrance of 0.81 for heterozygous individuals and of 0.9 for homozygotes with the
disease mutation). This model was chosen from five different single-locus models
(ranging from recessive to nearly dominant) due to its consistency with the
segregation patterns of BP in the two pedigrees and because it had demonstrated the
greatest power to detect linkage in simulation studies (Freimer, N.B., et al. (1996)
Neuropsychiat. Genet. 67, 254-263). Based on Costa Rican epidemiological surveys
(Escamilla, M.A., et al. (1996) Neuropsychiat. Genet. 67, 244-253; Adis, G. (1992)
"Disordenes mentales en Costa Rica: Observaciones Epidemiologicas," (San Jose,
Costa Rica: Editorial Nacional de Salud y Seguridad Social)), the population
prevalence of BP-I was assumed to be 0.015 (and thus the frequency of the disease
allele was assumed to be 0.003). The frequency of BP-I in individuals without the
disease allele was conservatively set at 0.01, which effectively specified a population
phenocopy rate of 0.67 (i.e., an affected individual in the general population has a
2/3 probability of being a phenocopy). For multiply affected families, the probability
that a gene segregates is highly increased, which implies that affected individuals in
our study pedigrees have a lower probability of being phenocopies than affected
individuals in the general population, particularly those with several affected close
relatives (the exact probabilities are dependent on the degree of relationship between
patients and the number of intervening unaffected individuals). These parameters
were chosen to ensure that most of the linkage information derives from affected
individuals. The rationale for selecting these parameters and results of analyses that
demonstrate the conservatism of this model are described by Freimer, N.B., et al.
(1996) Neuropsychiat. Genet. 67, 254-263. The LINKAGE package (Lathrop et al.
(1984) Proc. Natl. Acad. Sci. USA 81, 3443-3446) was used for lod score analysis and
to obtain maximum likelihood estimates of the marker allele frequencies, taking into
account the existing family relationships (see Boehnke, Am. J. Hum. Genet. 48, 22-25
(1991)).
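
The relationship between the model parameters quoted above (disease allele frequency 0.003, penetrances 0.81 and 0.9, and a frequency of 0.01 in non-carriers) and the stated prevalence and phenocopy rate can be checked with a short calculation. The sketch below is illustrative only; it assumes Hardy-Weinberg genotype proportions, which the text does not state explicitly, and the function name is hypothetical.

```python
def population_phenocopy_rate(q, f_het, f_hom, f_noncarrier):
    """Return (prevalence, phenocopy rate) for a single-locus model.

    q            -- disease allele frequency
    f_het, f_hom -- penetrance in heterozygous and homozygous carriers
    f_noncarrier -- frequency of the phenotype in non-carriers
    Assumes Hardy-Weinberg genotype proportions (an assumption of this sketch).
    """
    p_hom = q * q
    p_het = 2.0 * q * (1.0 - q)
    p_non = (1.0 - q) ** 2

    affected_carriers = p_het * f_het + p_hom * f_hom
    affected_noncarriers = p_non * f_noncarrier
    prevalence = affected_carriers + affected_noncarriers
    return prevalence, affected_noncarriers / prevalence


prevalence, phenocopy_rate = population_phenocopy_rate(0.003, 0.81, 0.9, 0.01)
print(round(prevalence, 4), round(phenocopy_rate, 2))
```

Under these assumptions the prevalence comes out at about 0.0148 and the phenocopy rate at about 0.67, in line with the values of 0.015 and 0.67 given in the text.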

Unrelated BP-I CRCV Patient Statistical Analyses

A likelihood test of disequilibrium (J. Terwilliger, Am. J. Hum. Genet. 56,
777 (1995)) was used to estimate a single parameter, lambda, that quantifies the
over-representation of marker alleles on disease chromosomes as compared to
non-disease chromosomes. We chose this method of analysis over another commonly
used disequilibrium analysis method, the transmission disequilibrium test (TDT, R.
Spielman et al., Am. J. Hum. Genet. 52, 506 (1993)) because data from all 48 BP-I
patients could be used in the likelihood approach. Effective use of the TDT requires
phase-known, heterozygous parental chromosomes. We do not have parental
genotypes for 20 of the 48 BP-I patients. Simulations indicated that with our data,
the likelihood test of disequilibrium would be more powerful than the TDT. Lambda
has been shown to be a superior measure for LD fine mapping, compared to other
frequently used measures, because it is directly related to the recombination fraction
between the disease and the marker loci. Non-disease chromosomes were chosen
from the phase-known chromosomes of parents, spouses and children of affected
individuals, if available. Designation of chromosomes of family members as
non-disease in a disorder such as BP-I, which is not fully penetrant, necessitates
specifying a model of disease transmission. The same model of transmission was
employed in this LD likelihood test as was used in the initial genome screen of the
pedigrees CR001 and CR004 described herein. One parameter was specified
differently from the genome screen: the phenocopy rate was set to zero in the LD
likelihood analysis. A phenocopy rate was not specified in the transmission model
because the effect of phenocopies will be "absorbed" by the lambda parameter, in that
presence of phenocopies in our sample will serve to erode the association between
marker alleles and disease, and hence reduce the estimate of lambda.

Coverage 

To assess coverage for a marker, the number of informative meioses at the
estimated recombination fraction was calculated using the estimate of the variance (the
inverse of the information matrix) (Petrukhin, K.E. et al. (1993) Genomics 15,
76-85). Alternatively, when the estimated frequency of recombination was close to 0 or
1, Edwards' equation was applied to calculate the equivalent number of observations
(Edwards, J.H. (1971) Ann. Hum. Genet. 34, 229-250). These meioses represent the
amount of linkage information provided by the marker, given the pedigree structure
and the genetic model applied. Linkage to the marker in question was then assumed,
and the lod score that would be observed as a disease gene is hypothetically moved
in increments away from that marker was calculated. All regions around a marker
that would have generated a lod score that exceeded our thresholds for possible
linkage (0.8 in CR001, 1.2 in CR004, and 1.6 in the combined data) were considered
covered. These lod score thresholds were derived from simulation analyses showing
the expected distribution of lod scores under linkage and non-linkage (Freimer, N.B.,
et al. (1996) Neuropsychiat. Genet. 67, 254-263), and approximately represent a result
that is 250 times more likely to occur in linked simulations than in unlinked
simulations. Coverage maps were constructed (FIG. 1) by superimposing the regions
covered by each marker on the genetic map of each chromosome. At the end of the
Stage II screen, a total of 473 microsatellite markers had been typed with genome
coverage (in the combined data set) of over 94%. Possible coverage gaps are
indicated by unshaded areas and are mainly concentrated near telomeres. Because the
coverage calculations make use of marker informativeness within the pedigrees, the
coverage approach permits detection of instances where markers with expected
high heterozygosities are uninformative in our data set.
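
The idea of moving a hypothetical disease gene away from a marker until the expected lod score drops below a threshold can be sketched as follows. This is a deliberately simplified illustration, not the procedure used in the study: it assumes fully informative, phase-known meioses, uses the standard expected lod score per meiosis for that case, and reports the result as a recombination fraction rather than a map distance. The function names and the example count of 10 meioses are hypothetical; the threshold of 1.6 is the combined-data threshold quoted above.

```python
import math


def expected_lod(n_meioses, theta):
    """Expected two-point lod score for n phase-known informative meioses
    when the true recombination fraction is theta (0 <= theta <= 0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")

    def term(p):
        # p * log10(2p), with the limit 0 as p approaches 0.
        return 0.0 if p == 0.0 else p * math.log10(2.0 * p)

    return n_meioses * (term(theta) + term(1.0 - theta))


def max_covered_theta(n_meioses, threshold, step=0.001):
    """Largest recombination fraction (on a grid) at which the expected lod
    still reaches the threshold, or None if it never does."""
    covered = None
    theta = 0.0
    while theta <= 0.5:
        if expected_lod(n_meioses, theta) < threshold:
            break  # expected lod only decreases as theta grows
        covered = theta
        theta = round(theta + step, 6)
    return covered


# Hypothetical example: 10 informative meioses, combined-data threshold 1.6.
print(expected_lod(10, 0.0))
print(max_covered_theta(10, 1.6))
```

In this simplified setting, a marker with more informative meioses covers a wider interval, while a marker with too few meioses covers nothing even at zero recombination, which is how uninformative markers show up as coverage gaps.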

Pedigree Linkage Analysis Results 

Of the 473 microsatellites analyzed with two-point linkage tests, 23 markers
exceeded the empirically determined thresholds designated for the coverage
calculations (in either CR001, CR004, or in the combined data set). The location of
these markers, the peak lod scores obtained in each family and in the combined data
set, and the maximum likelihood estimate of the recombination fraction (θ) at which
these lod scores were observed are indicated in Table 1. The approximate
chromosomal locations of these markers are also depicted in FIG. 1. The distribution
of lod scores (for the maximum likelihood estimate of θ in the combined data set)
across the genome is displayed by chromosome in FIG. 2.

The threshold was exceeded for pedigree CR001 in two adjacent markers near
the 18p telomere (D18S59 and D18S1105), but CR004 displayed no suggestion of
linkage in this region.

In the genome screen, the highest lod score observed for family CR001 alone
was at D18S59 (1.32 at θ = 0.0), located near pter. All affected members of CR001
shared alleles at markers in the 18pter region.

Unrelated BP-I CRCV Patient Study Results 

Out of the forty-two markers tested, eight displayed evidence of
over-representation of a particular allele on disease chromosomes. Eight of the 42
markers had -2*ln(likelihood ratio) statistics > 1.0. Three other markers had
-2*ln(likelihood ratio) statistics > 0 and < 0.62. The results are shown in Table I:

Table I

Marker      Allele Size    Frequency on Non-Disease    Frequency on Disease
                           Chromosomes                 Chromosomes

D18S59      154            0.121                       0.572
D18S476     271            0.470                       0.771
D18S467     172            0.384                       0.693
D18S61      177            0.074                       0.326
D18S485     182            0.237                       0.586
D18S870     179            0.405                       0.657
D18S469     234            0.128                       0.450
D18S1121    168            0.171                       0.553

Evidence for association was found at markers located near the telomere of
the short arm of chromosome 18. D18S59 displayed the strongest evidence for
LD (-2*ln(likelihood ratio) of 8.3, p = 0.002) of all the chromosome 18 markers
tested. An adjacent marker, D18S476 (-2*ln(likelihood ratio) of 1.3), also
provided evidence of LD. In our genome screening pedigree study we observed
the single highest lod score for pedigree CR001 of any marker in the entire
genome at D18S59. Furthermore, the alleles at D18S59 and D18S476 that are
over-represented among the BP-I patients from the population sample (154 b.p.
and 271 b.p. respectively) are observed in all BP-I patients from pedigree CR001.
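
The correspondence between the -2*ln(likelihood ratio) statistic and the reported p value can be illustrated with a short calculation. The text does not state the reference distribution, so the sketch below rests on an assumption: that the statistic is referred to a one-sided chi-square distribution with one degree of freedom (half the usual tail probability), as is common when the tested parameter is bounded at zero. The function name is hypothetical.

```python
import math


def one_sided_lrt_p_value(statistic):
    """p value for a -2*ln(likelihood ratio) statistic, assuming a one-sided
    chi-square reference distribution with 1 degree of freedom."""
    if statistic <= 0.0:
        return 1.0
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    two_sided = math.erfc(math.sqrt(statistic / 2.0))
    return 0.5 * two_sided


p_d18s59 = one_sided_lrt_p_value(8.3)   # statistic reported for D18S59
p_d18s476 = one_sided_lrt_p_value(1.3)  # statistic reported for D18S476
print(round(p_d18s59, 3), round(p_d18s476, 2))
```

Under this assumption the statistic of 8.3 gives a p value of about 0.002, which agrees with the value reported for D18S59; the p value for D18S476 is computed here for illustration only and is not reported in the text.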

The LD and pedigree findings in the 5 cM 18pter region denote a clearly
delineated region that contains a BP-I susceptibility locus. This region is distinct
from other regions on chromosome 18 that have been suggested as linked to mood
disorder phenotypes (more broadly defined than BP-I). See FIG. 6A, 6B, 6C. In
contrast to previous reports by Berrettini et al. and Stine et al., suggesting possible
linkage between mood disorder and markers in the pericentromeric region of
chromosome 18, our results did not show any evidence for association of BP-I
with any pericentromeric markers (D18S378, D18S53, D18S453 or D18S40).

Identification Of New Markers From The 5 cM 18pter Region 

Cloned human genomic DNA covering the target region is assembled.
Microsatellite sequences from these clones are identified. A sufficient area around
the repeat to enable development of a PCR assay for genomic DNA is sequenced,
and it is confirmed that the microsatellite sequence is polymorphic, as several
uninformative microsatellites are expected in any set. Several methods have been
routinely used to identify microsatellites from cloned DNA, and at this time no
single one is clearly preferable (Weber, 1990, Hudson et al., 1992). Most of
these require screening an excessive number of small insert clones or performing
extensive subcloning using clones with larger inserts.

New strategies have recently been developed that permit several different
microsatellites to be found within a single large insert clone
without requiring extensive subcloning. A method for direct identification of
microsatellites from yeast artificial chromosomes (YACs) provides several new
markers from the target region. This procedure is based on a subtractive
hybridization step that permits separation of the target DNA from the vector
background. This step is useful because the human DNA (the YAC) constitutes
only a small proportion of the total yeast genomic DNA.

YAC clones (with inserts averaging about 750 Kb of human genomic DNA)
that span the 5 cM 18pter region have already been identified by the
CEPH/Genethon consortium (Cohen et al., 1993) and are publicly available.
Markers from YACs that have been mapped to portions of the candidate region
that are not well represented by currently available markers are isolated first. By
typing these markers in the families and the "LD" sample, as described above, it
is possible to narrow the candidate region, perhaps to a size of less than one to
two cM, thus limiting the segment in which more extensive
mapping efforts are applied.

Briefly, the microsatellite identification procedure is performed as follows:
A subtractive hybridization is performed using genomic DNA from a target YAC
together with an equivalent amount of a control DNA. This procedure separates
the YAC DNA from that of the yeast vector. Following the subtraction procedure
the subtracted YAC DNA is purified, digested with restriction enzymes and cloned
into a plasmid vector (Ostrander et al., 1992). The cloned products of each YAC
are screened using a CA(15) oligonucleotide probe (i.e. an oligonucleotide having
15 CA repeats). Each positive clone (i.e. those that contain TG-repeats) is
sequenced to identify primers for PCR to genotype the BP-I samples.

An alternative approach, based on using a set of degenerate sequencing
primers that anneal directly to the repeat sequence, permitting direct thermal cycle
sequencing (Browne & Litt, 1992), can also be used.

Once the candidate region is narrowed to a size of less than about 500 to
1000 Kb, a contiguous array (contig) of clones with smaller inserts than YACs,
mainly P1 clones, is developed. P1 clones are phage clones specially designed to
accommodate inserts of up to 100 Kb (Shepherd et al., 1994).

Development Of A Physical Map Of The 5 cM 18pter Region 
In parallel with the genetic mapping, a physical map of the 5 cM 18pter
region is developed. The backbone of this effort is the assembly of contigs of
large insert clones. Low resolution contigs for most of the human genome are
already available using the YACs developed by CEPH (Cohen et al., 1993).
Although these have been individually verified and checked for overlap with other
YACs, there is a high rate of chimerism in the YACs and insufficient evidence to
definitively confirm the order of the YACs. In addition, because of their large
size these YACs are particularly cumbersome to work with. Nevertheless, they
provide a useful framework to start constructing high resolution contigs.

Once a candidate region of less than about five cM is delineated, the
studies to develop a physical map are commenced. Because of the disadvantages
of relying solely on YACs, and because positional cloning is facilitated by the
availability of a higher resolution map, contigs are generated using P1 clones once
the candidate region is narrowed to less than one Mb by LD mapping in the
expanded population sample using the new markers identified from the YACs.
Once a region of 500-1000 Kb or less is defined, physical mapping and
cloning are conducted using P1 clones rather than YACs, and P1 contigs over such
a region are constructed. The P1s are used to identify additional markers for the
further positional cloning steps as well as the screening for rearrangements.

The starting point of contig construction is the microsatellite sequences and
non-polymorphic STSs that derive from the few YACs that surround the
genetically determined candidate region. These STSs are used to screen the P1
library. The ends of the P1s are cloned using inverse PCR and used to order the
P1s relative to each other. Amplification in a new P1 indicates that it overlaps
with the previous one. Fluorescent in situ hybridization (FISH) permits ordering
of the majority of the P1s (Pinkel, 1988; Lichter, 1991). The original set of P1s
serves as building blocks of the complete contig; each end clone is used to re-screen
the library and in this way P1s are added to the map.

From each P1 additional microsatellites are identified as previously
described. This allows further reduction of the candidate region. When the
region is narrowed to less than one Mb in size, positional cloning efforts are
initiated.

Fine Mapping Of The 5 cM 18pter Region

In order to delineate further regions of BP-I susceptibility within the 5 cM
18pter region, additional unrelated BP-I patients from the CRCV as well as other
populations can be diagnosed and genotyped both with the markers described
herein and with additional markers in the 5 cM 18pter region, both those that are
known and those yet to be identified. Additional markers are available from the
Cooperative Human Linkage Center (CHLC) public database, from newer
Genethon and CHLC maps as they become available (Murray, J.C. et al. (1994)
Science 265, 2049-2054; Gyapay, G., et al. (1994) Nature Genet. 7, 246-339) and
from the public database of the Utah Center for Genome Research (all of which
are incorporated by reference herein). The web addresses for Genethon and
CHLC are: Genethon (http://www.genethon.fr/genethon_en.html), CHLC
(http://gopher.chlc.org/HomePage.html). These databases are all linked, and one
of ordinary skill in the art can readily access the information available from these
databases.

The markers shown in FIG. 6A, from number 1 to 22 or 23, can be used to
genotype the CRCV pedigrees and unrelated BP-I patients described herein as well
as other BP-I affected individuals and pedigrees. See FIG. 6A (portion of a
chromosome 18 map available from the Whitehead Institute, web address:
http://133. 30. 8. l:8080/=@ = : www-genome, wi. mit.edu. (incorporated herein by
reference)). The fine mapping techniques described herein in conjunction with the
teachings regarding the 5 cM 18pter region can be used to narrow the BP-I
susceptibility region further.

The following markers (listed in order of occurrence from the telomere
towards the centromere) were used to delineate regions of BP-I susceptibility
within the 5 cM 18pter region: SAVA5, ca211, ca212, D18S1140, D18S59,
ca231, ta201, at201, ca225, w3442, ca213, ga201, ga203, ca219, D18S1105,
ca209, ca202, D18S1146, GATA (referred to in the figures as 166d05) and
D18S476. The markers SAVA5, D18S1140, D18S59, ta201, at201, w3442,
ga201, ga203, D18S1105, D18S1146, GATA and D18S476 were used in both the
haplotype analysis (Figure 8) and the AHR analysis (Figure 11) to delineate the
BP-I susceptibility locus to the 500 kb region defined by the markers SAVA5 and
ga203 and the 300 kb region defined by D18S1140 and w3442. The other
markers were used in both the haplotype and the AHR analyses as confirmatory
evidence for the localizations. Blood samples from 105 affected individuals were
tested for the presence of marker haplotypes and compared to marker haplotypes
detected on the non-transmitted chromosome in samples obtained from the
parent(s) of the affected individuals when available (71 cases) or to markers
detected in samples obtained from a control population of students attending the
University of Costa Rica (52 samples). The non-transmitted chromosomes are
well matched as controls, allowing the affected haplotype of the transmitted
chromosome to be more easily discerned than through comparison with data
obtained from the general population, which may contain individuals who carry the
affected haplotype but do not exhibit clinical symptoms of bipolar mood disorder.

Figure 7 provides 18p allele frequencies for disease (aff 105) versus
nontransmitted (ntrans) chromosomes and samples from the control population of
students (control). The name of each marker used in this study is indicated on the
left. The second column of numbers refers to allele length in base pairs. These data
provide evidence of over-representation of a particular allele on disease
chromosomes.

Figure 8 summarizes the results obtained with affected individuals. The
column labelled 18p refers to the patient identifier, and each patient identifier is
repeated to indicate results with both copies of chromosome 18. The labels
"PANR" and "MANR" refer to the paternal and maternal identifier, respectively,
associated with the particular patient, other than 0, 1 and 2, which indicate that
parental samples were not available. The allele length (base pairs) is indicated
under each marker for a particular patient; the length of the horizontal bar in the
figure reflects whether haplotypes are IBD or IBS, with IBD haplotypes with
common ancestors having longer bars than randomly inherited IBS haplotypes. To
the right of each marker, a "1" indicates that the phase is known, i.e., that it is
known whether a particular allele is inherited from the paternal or maternal
chromosome, and a "0" indicates that the phase is not known with certainty. The
determination of phase allows the practitioner to conclude that marker alleles are
linked in a haplotype on the same disease-causing chromosome.

Figure 9 provides similar data for non-transmitted chromosomes
obtained from parental samples. Some individuals exhibited the affected haplotype,
indicating that the parent was homozygous; however, these regions of identity
were typically much shorter than those regions observed in affected individuals,
indicating that they were IBS.

Figure 10 similarly provides data for an unscreened population of
students from the University of Costa Rica and their parents (52 samples in total).
The data demonstrate that there is a lower incidence of the affected haplotype in
the general population as compared with Figure 8 and that the affected haplotype
is typically shorter, similar to the results obtained with non-transmitted
chromosomes. However, the results for the general population are less distinctive
than those observed for non-transmitted chromosomes in allowing one to map the
affected haplotype.

Comparison of the affected haplotype with non-transmitted chromosome
markers indicates that the region of maximal sharing between affected individuals
occurs between 1140t and w3442 on chromosome 18. This region encompasses
approximately 300 kb.

The data were analyzed further using Ancestral Haplotype Reconstruction
(AHR), a likelihood method for measuring LD. Data from affected individuals
are examined in 2-marker segments. Within each segment, the multinomial
likelihood of each of the possible ancestral haplotypes giving rise to the observed
sample of disease haplotypes is calculated. This likelihood is calculated assuming
some fraction, α, of disease chromosomes are associated with this 2-marker
segment, and (1-α) are not linked to this segment. These haplotype likelihoods are
weighted by the probability of observing that haplotype in the population, and
summed to create an overall likelihood for the 2-marker segment. This segment
likelihood is compared to the null likelihood, which assumes the disease and
markers are unlinked (and therefore α=0), and a LOD score is generated. The
LOD score is maximized over the parameter α. Details of these calculations are
presented in Appendix A. The results of this analysis are shown in Figure 11.
The percentages given above the diagonal line demarcated by the filled boxes
indicate the percentage of disease chromosomes hypothesized to be true
chromosomes from a common founder. For example, 17% of chromosomes
obtained from affected individuals have the 18S59 to W3442 region; i.e., as each
individual has two chromosome copies, 34% of individuals have this region. The
number above each percentage indicates the LOD score. The numbers given
below the diagonal line demarcated by the filled boxes indicate the alleles inherited
from a common founder, with the number prior to the dash indicating the allele of
the marker identified in the column heading and the number following the dash
indicating the allele of the marker identified in the row heading. The marker
alleles are referred to as follows:





MARKER     #    ALL
SAVA5      2    229
CA211      3    195
18S1140    2    268
18S59      4    154
18S59      6    158
TA201      2    220
TA201      3    230
CA231      2    186
CA231      4    202
AT201      1    170
AT201      2    178
CA225      1    160
CA225      3    172
W3442      1    10



Blank boxes indicate no positive evidence for linking the indicated region to the 
affected chromosome. 



Use Of P1 Clones To Identify Candidate cDNAs For Screening For
Mutations In The DNA Of BP-I Patients

The P1 clones described above are used to identify candidate cDNAs. The
candidate cDNAs are subsequently screened for mutations in DNA from BP-I
patients. The minimal candidate region defined by genetic mapping
experiments leaves a segment that is sufficiently large to contain multiple different
genes.

Identification Of Coding Sequences

Coding sequences from the surrounding DNA are identified, and these
sequences are screened until a probable candidate cDNA is found. Much of the
human genome will be sequenced over the next few years, in which case it may
become feasible to identify coding sequences through database screening.
Candidates may also be identified by scanning databases consisting of partially
sequenced cDNAs (Adams et al., 1991), known as expressed sequence tags, or
ESTs. These resources are already largely developed, and include upwards of
100,000 cDNAs, the majority expressed primarily in the brain. It is not yet clear,
however, that the complete set of cDNAs will be mapped to specific chromosomal
locations in the near future, or that these data will soon be made publicly
available. The database can be used to identify all cDNAs that map to the
minimal candidate region for BP-I. These cDNAs are then used as probes to
hybridize to the P1 contig, and new microsatellites are isolated, which are used to
genotype the "LD" sample. Maximal linkage disequilibrium in the vicinity of one
or two cDNAs is identified. These cDNAs are the first ones used to screen
patient DNA for mutations. Database screening has already been used to identify
a gene responsible for familial colon cancer (Papadopolous et al., 1993).

Coding sequences are also identified by exon amplification (Duyk et al.,
1990; Buckler et al., 1991). Exon amplification targets exons in genomic DNA by
identifying the consensus splice sequences that flank exon-intron boundaries.
Briefly, exons are trapped in the process of cloning genomic DNA (e.g. from P1s)
into an expression vector (Zhang et al., 1994). These clones are transfected into
COS cells, and RT-PCR is performed on total or cytoplasmic RNA isolated from the
COS cells using primers that are complementary to the splicing vector. Exon
amplification is tedious but routine, for example using the system developed by
Buckler et al. (1991). This method is probably preferable to another widely used
approach, direct selection, which involves screening cDNAs using large insert
clone contigs, with several steps to maximize the efficiency of hybridization and
recovery of the appropriate hybrid (Lovett et al., 1991). Although direct selection
is more efficient than exon amplification (Del Mastro et al., 1994), it may not be
practical as it depends on the candidate cDNA being expressed in the tissue from
which the cDNA library was made; there is no prior information to indicate the
tissue or developmental stage in which BP-I genes would be expressed.

Once cDNAs are identified the most plausible candidates are screened by
direct sequencing, SSCP or chemical cleavage assays (Cotton et al., 1988).

The data are also evaluated for clues to the possible identity or mode of
action of BP-I mutations. For example, it is known that trinucleotide repeat
expansion is associated with the phenomenon of anticipation, or the tendency for a
phenotype to become more severe and display an earlier age of onset in the lower
generations of a pedigree (Ballabio, 1993). Several investigators have suggested
that segregation patterns of BP-I are consistent with anticipation (McInnis et al.,
1993; Nylander et al., 1994). The apparent transmission of BP-I in association
with the conserved 18q23 haplotype is consistent with anticipation. Therefore, once
the candidate region is narrowed to its minimal extent, the P1 clones are screened
using trinucleotide repeat oligonucleotides (Hummerich et al., 1994). A PCR
assay is developed and patient DNAs are screened for expanded alleles.

Genetic and physical data help to map the bipolar mood disorder gene to
the 5 cM 18pter region of chromosome 18. New markers from this region are
tested in order to locate the bipolar mood disorder gene in a region small enough
to provide higher quality genetic tests for bipolar mood disorder, and to
specifically find the mutated gene. Narrowing down the region in which the gene
is located will lead to sequencing of the bipolar mood disorder gene as well as
cloning thereof. Further genetic analysis employing, for example, new
polymorphisms flanking D18S59 and D18S476, as well as cosmids, yeast
artificial chromosome (YAC) clones, or mixtures thereof, is employed in the
narrowing down process. The next step in narrowing down the candidate region
includes cloning of the chromosomal region 18pter including proximal and distal
markers in a contig formed by overlapping cosmids and YACs. Subsequent
subcloning in cosmids, plasmids or phages will generate additional probes for
more detailed mapping.

The next step of cloning the gene involves exon trapping, screening of
cDNA libraries, Northern blots or RT-PCR (reverse transcriptase PCR) of samples
from affected and unaffected individuals, direct sequencing of exons or testing
exons by SSCP (single strand conformation polymorphism), RNase protection or
chemical cleavage.

Flanking markers on both sides of the bipolar mood disorder gene
combined with D18S59 and D18S476, or a number of well-positioned markers that
cover the chromosomal region (5 cM 18pter) carrying the disease gene, can identify
affected or non-affected chromosomes with an accuracy in the range of 80-90%,
depending on the informativeness of the markers used and their distance
from the disease gene. Using current markers linked to bipolar mood disorder,
and assuming closer flanking markers will be identified, a genetic test for families
with bipolar mood disorder will be used for diagnosis in conjunction with clinical
evaluation, screening of risk and carrier testing in healthy siblings. In the future,
subsequent delineation of closely linked markers which may show strong
disequilibrium with the disorder, or identification of the defective gene, could
allow screening of the entire at-risk population to identify carriers, and provide
improved treatments.

Treatment of BP-I Patients Using Genotype Data 
Using the fine mapping techniques described herein, BP-I susceptibility loci
or genes in the 5 cM 18pter region, in particular in the region between SAVA5
and ga203, are identified and used to genotype patients diagnosed phenotypically
with BP-I. Preferably, markers in the roughly 500 kb region defined by SAVA5
and ga203, inclusive, are used. More preferably, markers in the region
defined by D18S59 and w3422, inclusive, are used.

Genotyping with the markers described herein as well as additional markers
permits confirmation of phenotypic BP-I diagnoses or assists with ambiguous
clinical phenotypes which make it difficult to distinguish between BP-I and other
possible psychiatric illnesses. A patient's genotype in the 5 cM 18pter region is
determined and compared with previously determined genotypes of other
individuals previously diagnosed with BP-I. Once an individual is genotyped as
having a BP-I susceptibility locus in the 5 cM 18pter region, the individual is
treated with any of the known methods effective in treating at least certain
individuals affected with BP-I, such as the administration of lithium salts,
carbamazepine or valproic acid.

Studies are conducted correlating effective treatments with BP-I genotypes
in the 5 cM 18pter region to determine the most effective treatments for particular
genotypes. BP-I patients can then be genotyped in the 5 cM 18pter region and the
statistically most effective treatment can be determined as a first course of therapy.

All publications and patent applications mentioned in this specification are
herein incorporated by reference to the same extent as if each individual
publication or patent application was specifically and individually indicated to be
incorporated by reference.

The invention now being fully described, it will be apparent to one of
ordinary skill in the art that many changes and modifications can be made thereto
without departing from the spirit or scope of the appended claims.

Appendix A

Consider the original mutation to have occurred on a chromosomal segment between
two markers A and B. At the time the mutation was introduced, there were
$n_a$ alleles at marker A and $n_b$ alleles at marker B. On the chromosome containing
the disease mutation both marker A and marker B carried allele X. The probability
that after g generations an affected individual carrying the original disease
mutation would still have allele X at markers A and B is:

$$
(1-\theta_1)^g (1-\theta_2)^g
+ (1-\theta_1)^g \left(1-(1-\theta_2)^g\right) f(X_B)
+ \left(1-(1-\theta_1)^g\right) (1-\theta_2)^g f(X_A)
+ \left(1-(1-\theta_1)^g\right) \left(1-(1-\theta_2)^g\right) f(X_A) f(X_B)
$$

eq (1)

where $\theta_1$ is the recombination fraction between disease and marker A, $\theta_2$ is the
recombination fraction between disease and marker B, g is the number of generations
since founding (i.e. since the mutation was introduced into the population),
$f(X_A)$ is the population frequency of the X-allele at marker A and $f(X_B)$ is the
population frequency of the X-allele at marker B. This equation includes terms
for the possibility of recombination between the markers and the disease locus,
with the X-allele at the markers then being identical by state (IBS) rather than
IBD. The probabilities of an affected individual with the original mutation having
other haplotypes can be formulated similarly. The probability of having allele Z
at marker B (where Z is any allele at marker B besides X) would be:

$$
(1-\theta_1)^g \left(1-(1-\theta_2)^g\right) f(Z_B)
+ \left(1-(1-\theta_1)^g\right) \left(1-(1-\theta_2)^g\right) f(X_A) f(Z_B)
$$

eq (2)

where $f(Z_B)$ is the frequency of allele Z at marker B in the population. The probability
of having allele Z at marker A (where Z is any allele at marker A besides
X) would be:

$$
(1-\theta_2)^g \left(1-(1-\theta_1)^g\right) f(Z_A)
+ \left(1-(1-\theta_1)^g\right) \left(1-(1-\theta_2)^g\right) f(X_B) f(Z_A)
$$

eq (3)

where $f(Z_A)$ is the frequency of allele Z at marker A in the population. Finally,
the probability of having allele Z at both markers A and B would be:

$$
\left(1-(1-\theta_1)^g\right) \left(1-(1-\theta_2)^g\right) f(Z_A) f(Z_B)
$$

eq (4)

These probabilities assume (1) no interference in recombination and (2) the same
marker alleles are present now as were present g generations ago, in similar
frequencies. If, for example, marker A has $n_a$ alleles and marker B has $n_b$ alleles,
then these probabilities form a $(n_a n_b)$ by $(n_a n_b)$ transition matrix, with row i
containing the probabilities that founder haplotype i gave rise to each of the
$n_a n_b$ different haplotypes in g generations. The rows of this transition matrix sum
to 1.

In simulations, the haplotype frequencies in the disease population were formulated
using these transition probabilities, assuming the disease arose on a haplotype
with the "1" allele at each of the two markers.
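
As an illustration only, the transition probabilities of eqs (1) to (4) can be sketched in Python. The function name `transition_row` and its arguments are hypothetical and not part of the original analysis. The sketch relies on the stated assumption of no interference, under which each marker independently either retains the founder allele or carries an allele drawn at its population frequency, so each cell is a product of two per-marker terms. Expanding that product reproduces eqs (1) to (4).

```python
from itertools import product


def transition_row(founder, freqs_a, freqs_b, theta1, theta2, g):
    """Row of the transition matrix for one founder haplotype (illustrative).

    founder : (allele at marker A, allele at marker B) on the ancestral chromosome
    freqs_a : dict mapping each marker A allele to its population frequency
    freqs_b : dict mapping each marker B allele to its population frequency
    theta1  : recombination fraction between disease locus and marker A
    theta2  : recombination fraction between disease locus and marker B
    g       : number of generations since founding
    """
    keep_a = (1.0 - theta1) ** g  # marker A never recombined away from the disease locus
    keep_b = (1.0 - theta2) ** g  # same for marker B
    row = {}
    for a, b in product(freqs_a, freqs_b):
        # Either the founder allele is retained (IBD), or recombination has
        # occurred and the allele is drawn at its population frequency (IBS).
        p_a = keep_a * (a == founder[0]) + (1.0 - keep_a) * freqs_a[a]
        p_b = keep_b * (b == founder[1]) + (1.0 - keep_b) * freqs_b[b]
        row[(a, b)] = p_a * p_b
    return row
```

Calling this function once for every possible founder haplotype would give the full transition matrix, one row per founder.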

Once these transition probabilities are estimated, the likelihood of a particular
founder chromosome giving rise to the observed sample of disease haplotypes in
g generations is easily estimated. For example, if one assumed that the disease
mutation arose on a chromosome with the X-allele at both markers, the likelihood
($L_{X-X}$) that this chromosome was the founder of the present-day sampled disease
chromosomes is given by the multinomial:

$$
L_{X-X} = \prod_{i=1}^{K} \left(p_{X-X,i}\right)^{Y_i}
$$

eq (5)

where i indexes the K potential haplotypes for the two markers ($K = n_a n_b$), $p_{X-X,i}$
is the probability that the ancestral disease chromosome with the X-allele at both
markers gave rise to a haplotype of type i in g generations (taken from the transition
matrix), and $Y_i$ is the observed number of haplotype i in the sample
($\sum_i Y_i$ = the number of chromosomes in the sample to be analyzed). The likelihood
in eq (5) assumes that all affected individuals are independent. While, after
many generations of separation from a common ancestor one might consider these
individuals to be independent, they are in fact related through a complex and unknown
pedigree. The simplification of considering individuals as independent
makes the likelihood much more tractable to compute.
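
The product in eq (5) can be evaluated in log space, which avoids very small numbers when many chromosomes are sampled. The following is a minimal sketch; the function name `founder_log10_likelihood` and the dictionary-based inputs are assumptions made for illustration and do not come from the original analysis.

```python
import math


def founder_log10_likelihood(row, counts):
    """log10 of eq (5): the product over haplotypes i of p_i ** Y_i.

    row    : dict mapping haplotype -> transition probability p_i for one founder
    counts : dict mapping haplotype -> observed count Y_i in the disease sample
    """
    total = 0.0
    for hap, y in counts.items():
        if y == 0:
            continue  # p ** 0 contributes a factor of 1
        p = row.get(hap, 0.0)
        if p <= 0.0:
            # An observed haplotype that this founder cannot produce.
            return float("-inf")
        total += y * math.log10(p)
    return total
```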

The K likelihoods are then weighted by the probability of observing
that particular haplotype in the population and summed to produce an overall
likelihood for the 2-marker segment:

$$
L = \sum_{i=1}^{K} f_i L_i
$$

eq (6)

where $f_i$ is the frequency of haplotype i in the population and $L_i$ is the
likelihood from eq (5) with haplotype i taken as the founder. This overall likelihood
calculation parallels the approach taken by Terwilliger (1995, eq (2)). The
haplotype frequencies are estimated from the sample of normal chromosomes. In
the event that the haplotype resulting in the largest contribution to the overall
likelihood in eq (6) is not observed in the normal sample, the upper 95% confidence
interval for this frequency is used, and the remaining haplotype frequencies
rescaled accordingly.

This overall likelihood is compared to the null likelihood, which is generated in
exactly the same manner, except that it is assumed the markers were unlinked to
the disease locus ($\theta_1 = \theta_2 = 0.5$ in, for example, eqs (1-4)). The $\log_{10}$ of this
likelihood ratio is a LOD score. One might instead use in the null likelihood transition
probabilities calculated under the assumption of linkage equilibrium. Under
this null the cells of the transition matrix are computed by multiplication of allele
frequencies, assuming independence of marker loci. These two forms of the null
likelihood are equivalent in value for g of approximately 20 or greater, and for
g<20 the values are nearly equivalent.

Because $\theta_1$ and $\theta_2$ are obviously unknown, the putative disease locus is set to be in
the middle of the segment and therefore $\theta_1$ and $\theta_2$ are one-half the genetic distance
(converted to recombination fraction by the Haldane mapping function (Ott,
1991)) between the two marker loci forming the segment. In fact, one could estimate
$\theta_1$ and $\theta_2$, or their ratio, and the method could easily be modified to do so;
however, for our purposes finding a linked segment is suitable.
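
For reference, the Haldane mapping function converts a map distance d (in Morgans) to a recombination fraction as 0.5(1 - exp(-2d)). The sketch below applies it to a locus placed midway between two markers. The function names are hypothetical, and the sketch assumes that half of the inter-marker map distance is converted, which is one reading of the text above.

```python
import math


def haldane_theta(distance_cm):
    """Recombination fraction for a map distance in centimorgans (Haldane)."""
    d = distance_cm / 100.0  # convert cM to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def midpoint_thetas(marker_distance_cm):
    """theta1 and theta2 for a disease locus assumed midway between two markers."""
    theta = haldane_theta(marker_distance_cm / 2.0)
    return theta, theta
```

For markers spaced 2 cM apart, `midpoint_thetas(2.0)` gives two equal recombination fractions of just under 0.01.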

This basic procedure has been modified to deal with heterogeneity in the sample
of disease chromosomes. Not all chromosomes in the disease sample may be true
disease chromosomes from a common founder. Individuals heterozygous for the
disease mutation will add one chromosome to the disease sample that will not be a
true disease chromosome. Additionally, affected individuals not linked to the
particular chromosomal segment being analyzed (either because they are phenocopies
or because of locus heterogeneity) will contribute two chromosomes to the
affected sample that do not harbor this disease locus. When the null hypothesis of
no linkage is not true, some fraction, $\alpha$, of the chromosomes in the disease sample
will be associated with this chromosomal segment, and $(1-\alpha)$ will not be associated.
We decided to examine $\alpha$ in steps of 0.1, from 1.0 to 0.0, and for each step in $\alpha$
produce a new transition matrix under the alternative hypothesis and calculate a
LOD score. If we call the transition matrix calculated under the alternative hypothesis
(where the disease locus is hypothesized to be in the middle of the 2-marker
segment) $T_a$ and call the transition matrix calculated under the null hypothesis
(where the disease locus is unlinked to the 2-marker segment) $T_n$, then a
new transition matrix for the alternative hypothesis is calculated as:

$$
T'_a = \alpha T_a + (1-\alpha) T_n
$$

eq (7)

The transition matrix under the null uses $\alpha = 0$. The LOD score is then maximized
over the one parameter $\alpha$.
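
The grid search over the mixing fraction can be sketched as follows. This is an illustrative outline under stated assumptions, not the original implementation: matrices are plain lists of rows (one row per founder haplotype, one column per observed haplotype type), the names `mix`, `segment_likelihood` and `max_lod` are hypothetical, and likelihoods are multiplied directly rather than in log space, which may underflow for large samples.

```python
import math


def mix(t_alt, t_null, alpha):
    """eq (7): element-wise alpha * T_a + (1 - alpha) * T_n."""
    return [[alpha * a + (1.0 - alpha) * n for a, n in zip(row_a, row_n)]
            for row_a, row_n in zip(t_alt, t_null)]


def segment_likelihood(matrix, hap_freqs, counts):
    """eq (6): sum over founders of f_i times the eq (5) product for row i."""
    return sum(f * math.prod(p ** y for p, y in zip(row, counts))
               for f, row in zip(hap_freqs, matrix))


def max_lod(t_alt, t_null, hap_freqs, counts, steps=10):
    """Scan alpha from 1.0 down to 0.0 and return (best LOD, best alpha)."""
    null_like = segment_likelihood(t_null, hap_freqs, counts)
    best_lod, best_alpha = float("-inf"), None
    for k in range(steps, -1, -1):
        alpha = k / steps
        like = segment_likelihood(mix(t_alt, t_null, alpha), hap_freqs, counts)
        lod = math.log10(like / null_like)
        if lod > best_lod:
            best_lod, best_alpha = lod, alpha
    return best_lod, best_alpha
```

Because the step at alpha = 0 reproduces the null matrix, the maximized LOD score in this sketch is never negative.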
WHAT IS CLAIMED IS:

1. A method of detecting the presence of a bipolar mood disorder
susceptibility locus in an individual comprising:
analyzing a sample of DNA from said individual for the presence of a
DNA polymorphism on the short arm of chromosome 18 between SAVA5 and
ga203, wherein said DNA polymorphism is associated with a form of bipolar
mood disorder.

2. The method of claim 1, wherein said DNA polymorphism is located on the
short arm of chromosome 18 between D18S1140 and ga203, inclusive.

3. The method of claim 1, wherein said DNA polymorphism is located on the
short arm of chromosome 18 between SAVA5 and W3422, inclusive.

4. The method of claim 1, wherein said DNA polymorphism is located on the
short arm of chromosome 18 between D18S1140 and W3422, inclusive.

5. The method of claim 1, wherein said DNA polymorphism is located on the
short arm of chromosome 18 between D18S1140 and at201, inclusive.

6. The method of claim 1, wherein said DNA polymorphism is located on the
short arm of chromosome 18 between D18S1140 and ta201, inclusive.

7. The method of claim 1, wherein said DNA polymorphism is located on the
short arm of chromosome 18 between D18S59 and ta201, inclusive.

8. The method of claim 1, wherein said analyzing further comprises:

a. obtaining DNA samples from family members of said individual,

b. analyzing said DNA samples from family members for the presence of
said DNA polymorphism, and

c. correlating the presence or absence of the DNA polymorphism with
a phenotypic diagnosis of bipolar mood disorder for said individual and for said
family members.

9. A method for detecting the presence of a DNA polymorphism linked to a
gene associated with bipolar mood disorder in an individual comprising:

a. typing blood relatives of said individual for a DNA polymorphism
located within a 500 kb region of chromosome 18, wherein said region is located
between SAVA5 and ga203, inclusive,

b. analyzing a DNA sample from said individual for the presence of
said DNA polymorphism.

10. A method of genetically diagnosing bipolar mood disorder in an individual 
comprising: 

a. obtaining a DNA sample from said individual,

b. analyzing said DNA sample for the presence of a DNA
polymorphism associated with bipolar mood disorder, wherein said DNA 
polymorphism is located within a 500 kb region of chromosome 18, wherein said 
region is located between SAVA5 and ga203, inclusive. 

11. A method of confirming a phenotypic diagnosis of bipolar mood disorder in
an individual comprising: 

a. obtaining a DNA sample from said individual, 

b. analyzing said DNA sample for the presence of a DNA 
polymorphism associated with bipolar mood disorder, wherein said DNA
polymorphism is located within a 500 kb region of chromosome 18, wherein said
region is located between SAVA5 and ga203, inclusive. 

12. The method of claim 10, wherein said individual has Spanish or
Amerindian ancestry. 

13. A method of classifying subtypes of bipolar mood disorder comprising: 

a. identifying one or more DNA polymorphisms located within a 500
kb region of chromosome 18, wherein said region is located between SAVA5 and 
ga203, inclusive; and 

b. analyzing DNA samples from individuals phenotypically diagnosed 
with bipolar mood disorder for the presence or absence of one or more of said
DNA polymorphisms.

14. A method of treating an individual diagnosed with bipolar mood disorder 
comprising: 

a. identifying one or more DNA polymorphisms located within a 500 
kb region of chromosome 18, wherein said region is located between SAVA5 and
ga203, inclusive; and 

b. analyzing DNA samples from individuals phenotypically diagnosed 
with bipolar mood disorder for the presence or absence of one or more of said
DNA polymorphisms, and 

c. selecting a treatment plan that is most effective for individuals
having a particular genotype within said 500 kb region of chromosome 18. 

15. An isolated polynucleotide capable of selectively hybridizing with a DNA 
sample from an individual phenotypically diagnosed with severe bipolar mood
disorder, wherein said polynucleotide does not selectively hybridize with a DNA
sample from an individual not affected by severe bipolar mood disorder, wherein 
said isolated polynucleotide selectively hybridizes with a complementary 
polynucleotide within a 500 kb region of chromosome 18, wherein said region is 
located between SAVA5 and ga203, inclusive. 

16. The isolated polynucleotide of claim 15, wherein said complementary
polynucleotide is within a 500 kb region of chromosome 18, between SAVA5 and 
ga203, inclusive. 